Database field type theory (is it ever ok to store numbers as text?)

thebatfink

Well-known member
Joined
Mar 29, 2012
Messages
47
Programming Experience
Beginner
Hi,

I am creating a database. I have a field whose data can be in one of two states.. either the field is not valid for the particular record in which case I guess the user would input 'N/A' or it is valid and then all input data will be numeric.

I'm just curious on whether thats a sound decision considering standard practise?

The reason I side on the text field solution to allow a number or a string of 'N/A' is because I need to display N/A on the reports generated from the data (I can't have empty or blank fields printing out, it must be explicit that there is no data to show.. ie it isn't a printing error for instance).

Anyone care to offer an opinion on the correct way of handling this situation?

Thanks!
 
It's generally an accepted rule to keep your data in it's native form - a number should be stored as a number. The question is, how will that field be used? if you're going to use those number strings as numbers and calculate totals, then you're going to need to convert those strings.

Personally, i'd store them as numbers and use, perhaps, -1 to denote the N/A state - a little bit of logic on your report output should be able to interpret the -1 (or whatever you choose) as a 'N/A'
 
Empty values in database is stored as 'null' when field is configured to allow nulls, in managed code these are handled as DBNull values.
In UI parts you can display any relevant string when dbnulls is present, such as set Binding.NullValue or DataGridViewCellStyle.NullValue to "N/A" string.
I don't know what you mean by "reports" exactly, but it is likely you also here can handle dbnulls to display appropriate formatting strings.
 
Does that also apply to empty text fields? ie if I edit a record via a form, by backspacing the charactors from the field.. will that result in a Null value or a zero length string? I remember having to test for Null and "" because records with a value, which were subsequently edited to be blank, resulted in zero length strings rather than Nulls (these could of been text fields though and now its in the back of my mind all the time).

I'm assuming with a numeric field type, zero length would always be Null?!

Its the not knowing if my field will end up empty or null I guess which is driving me to always populate the fields with 'something'. -1 could also be a solution, I hadn't thought of that.
 
Since this is being used for a report, I would assume the possibility of it needing to be used in calculations is there & I would also assume you cannot use a specific number to indicate no data either. What I would do is use a numeric data type in the database & allow nulls, then in your app have it display "N/A" for the records where the value is null, JohnH has already commented on this with examples of how to handle the nulls in your code too.
 
NullValue works both ways, if you input that value then binding will translate it to the DataSourceNullValue which defaults to DBNull.Value.
There is a difference in not setting NullValue and setting it to for example String.Empty. When NullValue has not been set then that means Nothing (managed null reference), that is not a value you can write in a textbox, when you clear textbox it will be a zero length string, which is a valid string value. If you have set NullValue to String.Empty the zero-length string will be equal to NullValue and translated to DBNull.Value by binding. In a DatagridView cell you can either input the designated NullValue (String.Empty is valid here too) or press Ctrl+0 to set the cell to null.
 
Hi.

So I now have some things to consider then. In the past with a different database, I encountered issues with people 'forgetting' to populate certain fields. So it's always on my mind to error proof the data input somehow. I had convinced myself allowing NullValues was a bad thing simply because I can't prevent data input being missed / skipped over. In this same example DB (originally created in Access) reports where used to create record specific documents used in a different process. These documents could not contain blank areas, if a specific value was not applicable, it had to explicitly state it wasn't. So this leads me to where I am now and my uncertainty.

I can allow nulls and then test for these wherever display of the value occurs and change it to N/A. But then there is the posibility of elements being skipped over when data entry occurs by users.
Or I disallow nulls, prevent missing data, but I have to store my (mainly) numeric data as text to allow for the instances when 'N/A' will need to be manually populated - or I got for the '-1' entry value which would certainly work in my case.

Most of the fields in question are holding things such as Temperatures, so what calculations would need to be performed I guess I could cope with parsing strings into numeric formats, but I sometimes like to search on these fields and such - plus if its numeric only, it again ensures someone inputs a number and not some other random phrase they think is appropriate.

I'm feeling there are benefits to both approaches and probably in my case as the numbers are not for use in scientific calculations, its probably ok to have them in either format?

Thanks for the help.
 
Preventing missing data has to do with validation, you don't need to save incomplete data entries in database - and if you do, allowing nulls is a better option than using any arbitrary value that has that meaning.
I don't see why there would be a reason to store specific type data like numbers/dates/booleans as strings.
 
Data should always be stored in the intended type for reason that if you ever have to work with it (such as summate a million numbers) you just do the work without the conversion (why convert a million stringsto number to addthem every time the query is run)

Null should be used if there is no data for that record on that row

Formatting of data (including turning NULL into a "n/a" string) should be done at query run time

If your data can frequently contain not-applicables consider that the data is being stored in the wrong place. If we were building a warehouse inventory that sold shoes, we wouldnt have a table like
[productid],[reebok_model],[nike_model],[adidas_model],[skechers_model]
1,reebok zig,n/a,n/a,n/a

ie, we've got 3 columns here that can never be aplicable to the shoe and one that is. when we input a nike shoe, the reebok one will be n/a

We minimise this problem by designing it out.. if a row is full of NA consider if it really should be in that table at all.. the database design is probably flawed
 
Back
Top