Access's Yes/No Field=1 bit, VBA's boolean type=2 bytes?

datAdrenaline, I think I would reverify the reference date for Excel vs. Windows. Since Excel uses the system clock to "do its thing" the two should be the same.

If SQL Server and JET use different numbers, ... I wish I could say I'm surprised. My Alphas running OpenVMS use 17 Nov 1858 as day 0. (It's called "Smithsonian" time.) A rose by any other name would smell as sweet.

Regarding the different tests and whether something was compared to zero or not, it isn't QUITE as simplistic as that. In the final analysis, for a cast, you are very right: they are all compared to zero, and it is either zero (FALSE) or not zero (TRUE). BUT the problem is ... WHICH BIT is compared to zero? If it ain't "packed" then you need to know the language's definition of a Boolean. Which is why I made all that hoo-raw about the different instructions.

Is it that the sign bit is the one that counts? Then you use the instructions for less/greater - and the other bits don't count.

Is it the least significant bit that counts? Then you use the tests for odd/even.

Is it the whole byte? Then you could use a test for zero/not zero.

BUT the language specification tells you how the test works. Not the hardware. The only thing the hardware does for you is make it easier or harder to implement the language. In Ada, for example, you can look this up under the ANSI standard specification for data representation. FORTRAN, BASIC, Pascal, C, ... each has its standard. And it is OK for them to be different.
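To make that concrete with the language at hand: VBA's specification settles the question - a Boolean is two bytes, True is stored as -1 (all bits set), and any nonzero value coerced to Boolean becomes True. A quick sketch of mine, just for illustration:

Code:
' VBA's definition of the test: only zero is False; any nonzero value is True.
Dim b As Boolean
b = 2                 ' implicit cast: nonzero, so b becomes True
Debug.Print b         ' True
Debug.Print CInt(b)   ' -1 (all 16 bits set in VBA's 2-byte Boolean)
Debug.Print CBool(0)  ' False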

Now, Banana, you commented above about knowing when one of those little "casts" occurs. Well, actually it is easy in VBA under Access. "Option Explicit" will tell you when you are about to create a new variable. Then all that is left is to examine anything that involves a known cast variable.

Take that case where you did CDbl(1/24) instead of CDbl(1)/CDbl(24). If that was actually supposed to be a time, that failure to correctly pre-cast cost you 33 seconds (roughly), which is the size of the error at which 1/24 handled as a SINGLE truncates a date/time number for contemporary dates. That is a quick-and-dirty approximation, so don't hold my feet to the fire if you think I'm off by a couple of seconds.
 
Doc Man, you are correct about Option Explicit, but I still find it a bit long in the tooth.

For example, VB.NET has Option Strict, which basically forbids any implicit datatype conversion, so 1/24 (as Integer) + Date() (as Double) will be rejected by the compiler with an error, requiring you to explicitly typecast as CDbl(1/24) + Date().

You also probably already know that Ada is among the most anal-retentive languages out there regarding typecasting, which is a very good thing for mission-critical applications. I certainly wouldn't want to toggle a nuclear missile launch because there was an implicit typecast that I didn't know about!! :eek: ;)

Implicit typecasts are not inherently bad in themselves - they were provided to make our job easier and remove the boring, repetitive task of checking variables' types at write time when the compiler could very well handle it. I would be equally happy if the compiler would at least tell me that it'll have to do an implicit typecast on a line after I press Return... perhaps as an auto-magic comment to the right of my recently inserted line of code, so I know a bit about what's happening in the background.
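For what it's worth, here is the sort of silent conversion I mean in VBA; Option Explicit catches none of it, since every variable is properly declared (a sketch of mine, not production code):

Code:
Option Explicit

Sub SilentCasts()
    Dim i As Integer
    Dim d As Double
    i = 3.7        ' implicit Double-to-Integer cast: rounds to 4, no warning
    d = 1 / 24     ' fine here, but nothing tells you when a cast DID occur
    Debug.Print i, d
End Sub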

This is especially true when we consider that Access is more likely to be developed by someone who isn't a full-time programmer with a formal education in designing and deploying software solutions.


Regarding the date/time type, I think this is a good point - it would be a very bad idea to take a raw binary representation and add it to another binary representation, because that would mean we're assuming the epoch is the same. I think this illustrates the problem very well:

[Attached image: bug.png]
 
G’day Banana.

>>Regarding the date/time type, I think this is a good point - it would be a very bad idea to take a raw binary representation and add it to another binary representation, because that would mean we're assuming the epoch is the same. I think this illustrates the problem very well:<<

I doubt if the above is correct, but please let me try to give an example.

We would not be using the raw binary representation of a date/time but rather the IEEE 64-bit floating point equivalent as a number of days… no reference point is implied within the so-called Access Date/Time data type.

It’s not a Date/Time; it is a number of days.

Example: if we put 1.5 in a Date/Time field it is purely 1.5 days; 36 hours +/- ~100 ns.
There is nothing in the Date/Time data type to suggest that it is a Date/Time.
As has been said, it derives its Date/Time from the number of days as an offset from day zero (0).
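A quick way to see this from VBA's Immediate window (VBA's day zero being 30-Dec-1899):

Code:
' 1.5 is just a number of days; the "date" reading comes only from day zero.
Debug.Print CDate(1.5)                       ' 12/31/1899 12:00:00 PM
Debug.Print CDbl(#12/31/1899 12:00:00 PM#)   ' 1.5
Debug.Print CDbl(#12/30/1899#)               ' 0 (day zero)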

But…

The Date/Times can cause problems (please see the following attachment).
When you open the attachment it should look fine…note Midnight to 13:00 hours.

Now close it, switch Regional and Language Settings to Afrikaans.

Restart the program and note Midnight to 13:00 hours.

How do we fix this? Next post please.

Regards,
Chris.
 

Attachments

G’day Banana. II

Now for the fix.

Scrap internal Date/Times and use straight doubles.
Apart from avoiding Regional and Language Settings issues (except where needed for display), this method also allows increasing the resolution to beyond one second.

Please note in the attachment that all Date/Time fields have been removed and replaced with straight doubles. All Date/Time calculations are done in straight doubles.

And it works in all Regional and Language Settings.
(I also think it works in all natural language versions of Access, but that has never been proved. Way too difficult.)
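A minimal sketch of the idea (the variable name is mine, purely for illustration): store the raw number of days as a Double and cast only at display time:

Code:
' Store the raw number of days; format only when displaying.
Dim dblWhen As Double
dblWhen = CDbl(Now())   ' capture the moment as a plain Double

' ...later, for display only, cast back and format:
Debug.Print Format$(CDate(dblWhen), "yyyy-mm-dd hh:nn:ss")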

Regards,
Chris.
 

Attachments

>>I doubt if the above is correct, but please let me try to give an example.

We would not be using the raw binary representation of a date/time but rather the IEEE 64-bit floating point equivalent as a number of days… no reference point is implied within the so-called Access Date/Time data type.<<

Now this shows how I have to quit being fast and loose if I am to make any sense. Please allow me to rephrase.

By raw binary representation, I was actually referring to the floating-point number as specified by IEEE. As for the reference point, the problem I perceived is this: suppose a company had offices in different time zones but, for recordkeeping purposes, used just one zone. The local office computers will of course be set to the local time zone. Therefore, if we passed a double representation, it would be incorrect, because the server would interpret it as occurring x hours before/after its own zone. But if we passed a date/time string, the server could then cast it to the correct double representation.
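To sketch what I mean (the six-hour offset below is just an assumed example, not anyone's real configuration):

Code:
' A raw double carries no time zone: the same number means different
' instants in different zones. One approach: normalise to the
' bookkeeping zone before storing. The offset is purely illustrative.
Const HOURS_TO_HEAD_OFFICE As Long = -6

Dim dtLocal As Date, dtHeadOffice As Date
dtLocal = Now()
dtHeadOffice = DateAdd("h", HOURS_TO_HEAD_OFFICE, dtLocal)
Debug.Print CDbl(dtLocal); CDbl(dtHeadOffice)  ' two different doubles, one instant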

This is where implicit typecasting is a good thing. The devil, however, is in not knowing whether an implicit typecast happened, which will then manifest itself as a bug.

Hope this clears things up a bit. Will take a look at your attachment when I have access to Access. (Sorry, bad pun.)
 
Hello The Doc Man ...

>> datAdrenaline, I think I would reverify the reference date for Excel <<

The date in Excel (I'm talking the spreadsheet world here ... not VBA for Excel) is as I have stated. Day 1 = 1/1/1900; there is no day 0. If you type 12/31/1899 in a cell, you will see that it becomes left-aligned, indicating text, whereas dates are normally right-aligned (assuming default settings everywhere) ... Also, if you use the DATE() function (which is analogous to VBA's DateSerial) in a cell of an Excel workbook with:
DATE(1899,12,31) ... the returned result (numerically) is 693962, which when formatted as a date is 12/31/3799 ... but if you make cell A1 1/1/1900, then make cell A2 the expression A1 - 1, you will get 1/0/1900, which is kinda off if you ask me! ... Basically, all this is to show that Excel workbooks do not understand a date serial of 0, or any date less than 1/1/1900.
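Since the workbook and VBA disagree, here is a quick Immediate-window check of the VBA side (whose day 0, as the note below spells out, is 12/30/1899):

Code:
Debug.Print CDbl(#12/30/1899#)   ' 0  -- VBA's day zero
Debug.Print CDate(1)             ' 12/31/1899 -- VBA's day 1
' ...whereas an Excel worksheet cell treats serial 1 as 1/1/1900.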

{Note: this is for Excel for Windows using the 1900 date system, not Mac with 1904. One more note: VBA for Excel has a day 0 of 12/30/1899, just as all other VBA installations do ... interesting that VBA and a workbook have a different base date for day 0!!} ... Which leads to your statement ...

>> If SQL server and JET use different numbers ... I wish I could say I'm surprised. <<

The base dates that correlate to 0 are indeed different between JET and SQL Server; they are as I indicated above.

... Gotta go ... still painting!! ...
 
Doc Man

re the base date on VMS Machines

A friend once told me it was the birth date of Charles Babbage, I think.
 
Gemma, the VMS documentation says otherwise about the base date. Supposedly it was some internationally agreed-upon transit of some planet that EVERYONE with a telescope observed and noted the time of.

Again, I've got to be contentious, but watch out for dates. If you put 1.5 in a DATE/TIME variable type, that IS a calendar date. It is 1 day 12 hours after the reference date. Now, if you did that to a DOUBLE, then I agree it is a raw number. But the thing is, once you declare it as date/time, Access automatically and forever formats it as a date and time, because it carries "baggage" with it. Hmmm... since old Charlie was one of the fathers of mechanical computing, would that become Babbage Baggage? Can you even say that three times fast? But I digress...

Here's an experiment for you. Create a record with a date/time field; do not specify a particular format for display, just take the defaults. Add 7 hours. Examine it by query. Add another 7 hours. Examine it again. OK, let's take a BIG leap. Add 14 hours. NOW check the value. By default, that number is displayed as 31 Dec 1899 at about 0400 military time (28 hours past the 30 Dec 1899 day zero). And that is because even though a DATE/TIME is merely a cast of a DOUBLE, that cast is recorded in the field descriptor for that record, and you ALWAYS get a date/time format unless you do an explicit cast. If it were a "real" double, you would merely have 28 hours expressed as days and fractions.
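If you want to try that without building a table first, here is the same experiment sketched in plain VBA (the table version behaves the same way, which is the point):

Code:
Dim d As Date
d = 0                    ' day zero: 30 Dec 1899 in Access/VBA
d = DateAdd("h", 7, d)
d = DateAdd("h", 7, d)
d = DateAdd("h", 14, d)  ' 28 hours in total
Debug.Print d            ' 12/31/1899 4:00:00 AM -- always formatted as a date/time
Debug.Print CDbl(d)      ' 1.16666666666667 -- the raw double underneath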
 
Clearly my friend misled me.

I just looked Babbage up, and he was born in 1791!
 
G’day Doc.

Well, since I mentioned the 1.5 days thing, I guess your reply was partially directed towards me.

This part I don’t understand >>But the thing is, once you declare it as date/time, then Access automatically and forever formats it as a date and time.<<

Agreed that Access always formats it as a date/time, but does that necessarily mean it needs anything stored within the table in order to do so? It may be that, simply because it’s in a date/time field, Access formats it as a Date/Time for display whenever needed.

For example, if we had formatting information stored with the date/time offset information (the "baggage" to which you refer) then we should expect to see a formatted date/time in some particular set format. Yet if we switch Regional and Language Options, the format changes. This would indicate that the formatting is done at time of display, and that would also indicate we do not need any "baggage" saved with the raw data in order to do so.

It would also mean that the "baggage" would need to be stored in each date/time field.

------------

So I ran the following test: -

In A2K3, two databases, one table in each with 131069 records…
Db1.mdb: Date/Time field defaulted to a Date only of 16-Jul-2008 --- 2188 KB
Db2.mdb: Converted the above to a Double field, 39645 (16-Jul-2008) --- 2188 KB

This would indicate that there is no formatting "baggage" within the table with the Date/Time field.
If there is no formatting "baggage" within the date/time field, then there is only the raw double data, and that data is simply a number of days… not a date/time.
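If anyone wants to repeat the size check, the comparison itself is one-liner territory (the paths are whatever you saved the test databases as; compact both first):

Code:
' File names are illustrative; substitute your own paths.
Debug.Print FileLen("C:\Test\Db1.mdb")   ' Date/Time field version
Debug.Print FileLen("C:\Test\Db2.mdb")   ' Double field version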

The test databases are attached.

Regards,
Chris.
 

Attachments

You missed one possibility - that the baggage, though different in content for each, was the same SIZE in each of your test cases.

OK, let's get ethereal here. A display on a screen doesn't exist any longer than the time that its window (parent, child, adjunct, or pick-your-favorite) is open. Looking at the size of undisplayed data proves nothing. But it gets crazier than that. The window in which you see a form, report, or table display is the same size regardless of which of those sources is used. If a window is 1000 bytes by 1000 bytes (1 MB), then that is its size regardless of what is painted on it. So even dynamically, size doesn't matter. (At least, that's what guys keep on saying...)

Another thing you need to know is that in the TableDef for your table, the Fields collection shows either DATE/TIME or DOUBLE as the field type - both of which take up exactly the same amount of space, because the type code is one byte. This is, of course, METADATA, so you can't see it quite so easily. But if you checked the size of the field in a disk record, or looked at the size of the window, or looked at its descriptive data, they WOULD be the same. So your experiment didn't prove anything. The ONLY difference to be seen isn't size but a more evanescent property - the "marks" made in the window when that item is displayed. But for windows using bit-raster format, the marks make no difference to anything except the display driver.
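You can poke at that one-byte type code through DAO (the table and field names here are placeholders):

Code:
' The field type is just a small metadata code in the table definition.
Dim fld As DAO.Field
Set fld = CurrentDb.TableDefs("tblTest").Fields("SomeField")
Debug.Print fld.Type   ' 8 = dbDate, 7 = dbDouble
Debug.Print fld.Size   ' 8 bytes of storage either way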

Now, here is the issue that you are looking at and not seeing. Look at this like a computer, not like a person.

Access does everything it possibly can DYNAMICALLY. Remembering that a date/time field is a CAST of a double float field, you would expect them to statically take up the same amount of space. Remembering that the field definition for those fields uses one byte that holds EITHER the DATE/TIME or the DOUBLE type code, both of which take up the same amount of space, you would again expect them to statically take up the same amount of space. And there is your problem - you are looking at them statically. But hell, they are no fun that way. Look at them DYNAMICALLY.

Access is a really great example of a data-driven environment. (In this case, to include METADATA as part of the driving.) When Access is loaded into memory and you actually CALL UP something that includes one or the other of those field types, the buffer allocated to the window will be modified one way if you express the field as a date-time and the other way if you express it as a double.

Why does the date/time format change? Because Access dynamically determines the formatting if the format code for the date/time field is one of the "default system date" options - of which there are several. And the default for date/time fields is itself a setting in the database preferences (which is part of an internal system table that is very well hidden). Further, in many cases it depends on something that is outside of Access entirely - the Windows date format of choice. Not even a part of Access.

Therefore, here is where we have the viewpoint problem. Dynamically, a date/time is TREATED differently than a double. Statically, neither is treated at all.

When I say that a date/time field is forever a date/time field, I'm actually applying the Heisenberg principle and a little bit of faith. In order to observe something to determine its state, you must interact with it. If it isn't participating in an interaction, we have the "tree falls in the forest" conundrum.

If a date/time field is expressed on a computer but nobody is there to see it, is it even there? (Egad, getting deep in this one... pardon me while I go to another question and let my brain cool off.)
 
Too many words Doc.

My assertion is that the data stored in a date/time field is simply a number of days in floating point format and it is exactly the same format as a double. It is a length of time which only becomes a point in time by adding the day zero.

If you have a different opinion then that’s fine, I can live with it.

Regards,
Chris.
 
Yes ... I know this is an old thread, but a discussion over at UtterAccess between Banana and myself prompted a re-visit to this thread. In short, if you don't want to read it all, it seems that Banana and I were able to conclude (in separate tests) that Access's Yes/No datatype (Jet/ACE BOOLEAN or BIT) does not consume a Byte, as I had stated, and supported with a test, earlier in this thread. It seems that the storage space consumed by Jet/ACE's Boolean datatype is a BIT.

Here is a copy/paste from my UA post supporting this. The UA thread contains another post from Banana that supports the same conclusion.

-------

Different test for the storage size of the Yes/No (BIT) field...

Here is my thinking. A single record has a maximum byte size, right? ... Well, my thought was to pack a record full of data until an error occurred on insert. So I started out creating a table with Text(255) fields (no Unicode compression) and increased the field count until an insertion (all fields set with a 255-character string) raised an error.

Then I would drop the field count back one, and continue to add fields using the Currency data type (8 bytes) until an error occurred on insert...
Then I would drop the field count back one, and continue to add fields using the Number/Long data type (4 bytes) until an error occurred on insert...
Then I would drop the field count back one, and continue to add fields using the Number/Integer data type (2 bytes) until an error occurred on insert...
Then I would drop the field count back one, and continue to add fields using the Number/Byte data type (1 byte) until an error occurred on insert...
Then I would drop the field count back one, and continue to add fields using the Yes/No data type (? bytes) until an error occurred on insert...

Following that thought process, if Boolean (Yes/No) were stored as a Byte, then I would not have been able to add any Yes/No fields at the Byte-to-Boolean transition.

Here is the code I used to implement the experiment. This is the final version; each For...Next loop was added as the experiment progressed.


Code:
Sub TestBooleanStorageSize()

    Dim db As DAO.Database
    Dim tbl As DAO.TableDef
    Dim fld As DAO.Field

    Set db = CurrentDb

    'Drop the table from any previous run (ignore the error if it isn't there)
    On Error Resume Next
    DoCmd.DeleteObject acTable, "tblByteBitTest"
    On Error GoTo 0

    Set tbl = db.CreateTableDef("tblByteBitTest")

    With tbl
        Dim x As Integer

        '7*255*2 = 3570 total bytes of data successfully written
        For x = 1 To 7  'With only this loop, the error on insert was raised at 8,
                        'so error with 4080 bytes of data being inserted.
            Set fld = .CreateField("Field" & x, dbText, 255)
            fld.DefaultValue = String(255, "~")
            .Fields.Append fld
        Next x

        '57*8 = 456 bytes (4026 total bytes of data successfully written)
        For x = 8 To 64 'With this and the above loop, the error on insert was raised at 65,
                        'so error with 4034 bytes of data being inserted.
            Set fld = .CreateField("Field" & x, dbCurrency)
            fld.DefaultValue = 0
            .Fields.Append fld
        Next x

        '1*4 = 4 bytes (4030 total bytes of data successfully written)
        For x = 65 To 65 'With this and the above loops, the error on insert was raised at 66,
                         'so error with 4034 bytes of data being inserted.
            Set fld = .CreateField("Field" & x, dbLong)
            fld.DefaultValue = 0
            .Fields.Append fld
        Next x

        'Could not add a field of dbInteger, so error with 4032 bytes of data being inserted.

        '1*1 = 1 byte (4031 total bytes of data successfully written)
        For x = 66 To 66 'With this and the above loops, the error on insert was raised at 67,
                         'so error with 4032 bytes of data being inserted.
            Set fld = .CreateField("Field" & x, dbByte)
            fld.DefaultValue = 0
            .Fields.Append fld
        Next x

        '6*1 = 6 bits (4031 bytes and 6 bits of data successfully written)
        For x = 67 To 72 'With this and the above loops, the error on insert was raised at 73,
                         'so error with 4031 bytes and 7 bits of data being inserted.
            Set fld = .CreateField("Field" & x, dbBoolean)
            fld.DefaultValue = 0
            .Fields.Append fld
        Next x

    End With

    db.TableDefs.Append tbl

    'Insert one explicit value into the last field added; the remaining
    'fields are filled in by their DefaultValue settings.
    Dim strValue As String
    strValue = 0 'Was String(255, "~") when the last field added was Text(255)
    db.Execute "INSERT INTO tblByteBitTest (Field" & x - 1 & ") VALUES ('" & strValue & "')"

End Sub

Conclusions ...

- The data size of a record seemed to be < 4032 bytes in this test, since every attempt to insert data adding up to 4032 bytes or more failed.

- Boolean IS NOT THE SAME SIZE AS Byte ... it seems to be a BIT!!! If Byte and Boolean were the same size, I would NOT have been able to add ANY Boolean fields after I maxed out the Byte fields, but I was able to add 6 more fields using the Boolean datatype.

----

So ... after all these years, and after re-visiting and challenging our thoughts AND experiments --- it seems that JET's Boolean consumes a BIT instead of a Byte. In the transition to VBA, I still think it's an "All Bits On" sort of thing; otherwise, changing a Boolean column to a Byte column would yield 0's and 1's, whereas the actual result is 0's and 255's.
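The VBA half of that is easy to demonstrate:

Code:
Dim b As Boolean
b = True
Debug.Print CInt(b)   ' -1: all 16 bits of VBA's 2-byte Boolean are set
' The same all-bits-on pattern read as a single unsigned byte is 255,
' which is why converting a Yes/No column to Byte yields 0's and 255's.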
 