Go Back   Access World Forums > Microsoft Access Reference > Code Repository

 
Reply
 
Thread Tools Rate Thread Display Modes
Old 09-07-2019, 05:37 PM   #1
Micron
Newly Registered User
 
Join Date: Oct 2018
Location: Ontario, Canada
Posts: 872
Thanks: 10
Thanked 181 Times in 171 Posts
Micron has a spectacular aura about Micron has a spectacular aura about
Randomizing Data for Posted db's

We often ask Original Poster to upload a copy of their db but the issue might be that it contains sensitive data. Complete data removal may not allow us to troubleshoot the OP's issue because queries, forms, etc. need records in order to observe or trouble shoot the problem.

Here's a little function that we could point them to whereby they could, ON A COPY OF THEIR DB, randomize string field data by using an update query that calls the function below. This could solve the issue of revealing sensitive data by randomizing all or part of a string field. The function call goes into the UPDATE TO row of the query design grid and a sample sql statement would look like
Code:
UPDATE tbl1 SET tbl1.field1 = randomizedata(tbl1.field1,3);
The supposition is that it may be desirable to retain n characters at the beginning of the string and randomize the rest, hence the numeric parameter. If not, use 0 (untested). The function call would go into each text field that you want to randomize, using the proper field reference of course. The function being called is:
Code:
Function RandomizeData(strField As String, i As Integer) As String
Dim str1 As String, str2 As String

'i= count of characters from beginning of string that are to be retained. 
'Pass to Left function and assign to str1
 str1 = Left(strField, i)

'make loop counter start at position number of NEXT character which is to be replaced. 
'End value is string length.
For i = i + 1 To Len(strField)

'make str2 equal current value + a random lower case character from a to z 
'and repeat to end of counter. Spaces are retained.

If Mid(strField, i, 1) = " " Then
    str2 = str2 + Mid(strField, i, 1)
Else
    str2 = str2 & Chr((121 - 97 + 1) * Rnd + 97)
    'Debug.Print str2
End If
Next

'piece left and right parts together and return to query
RandomizeData = str1 + str2

'e.g if input value is 'Donald Duck', output string with 1st 2 characters retained might be Dosnoh itat

End Function
***NOTE***
Randomizing data is only practical IF the values are NOT PK (Primary Key) values, or if they are and exist as meaningful data, then Cascade Updates should be set in relationships between a field to be randomized and any other table where those values exist as Foreign Keys. Randomizing one side of a relationship and not the other would be pointless.

Lastly I'd mention that the user should realize that the result of the function cannot be reverse engineered; i.e. once randomized there is no way to determine what the original values were.

__________________
Sometimes I just roll my eyes out loud...
Windows 10; Office 365 (Access 2016)
Micron is offline   Reply With Quote
The Following 4 Users Say Thank You to Micron For This Useful Post:
Gasman (09-08-2019), isladogs (09-08-2019), NauticalGent (09-08-2019), Uncle Gizmo (09-08-2019)
Old 09-08-2019, 11:50 AM   #2
isladogs
Part time moderator
 
isladogs's Avatar
 
Join Date: Jan 2017
Location: Somerset, UK
Posts: 10,104
Thanks: 110
Thanked 2,746 Times in 2,505 Posts
isladogs is a splendid one to behold isladogs is a splendid one to behold isladogs is a splendid one to behold isladogs is a splendid one to behold isladogs is a splendid one to behold isladogs is a splendid one to behold
Re: Randomizing Data for Posted db's

Using this is certainly a lot quicker than the manual randomising I've used in the past to create DEMO databases for my website.

This should be a very useful reference to suggest as a solution when OPs say they cannot upload data because its confidential.

I've tested it using 0 and all characters are randomised successfully.

It would be even better to expand this idea for text field that contain a mixture of alphanumeric characters using numbers where appropriate and similarly for number & date fields
__________________
If this answer has helped, please click the Thanks button and/or click the 'reputation scales' symbol on the left.

Website links:
To view links or images in signatures your post count must be 10 or greater. You currently have 0 posts.
,
To view links or images in signatures your post count must be 10 or greater. You currently have 0 posts.
,
To view links or images in signatures your post count must be 10 or greater. You currently have 0 posts.
,
To view links or images in signatures your post count must be 10 or greater. You currently have 0 posts.


Colin
Access 2010 32-bit, Access 2016 32-bit & 64-bit, SQL Server Express 2014, Windows 10,
To view links or images in signatures your post count must be 10 or greater. You currently have 0 posts.
,
To view links or images in signatures your post count must be 10 or greater. You currently have 0 posts.
isladogs is offline   Reply With Quote
Old 09-08-2019, 04:19 PM   #3
Micron
Newly Registered User
 
Join Date: Oct 2018
Location: Ontario, Canada
Posts: 872
Thanks: 10
Thanked 181 Times in 171 Posts
Micron has a spectacular aura about Micron has a spectacular aura about
Re: Randomizing Data for Posted db's

Quote:
It would be even better to expand this idea for text field that contain a mixture of alphanumeric characters using numbers where appropriate and similarly for number & date fields
Do you mean to randomize the numbers but leave special characters alone (such as - , ( ) ' $ etc) or not randomize numbers and not special characters? Or something else?

Micron is offline   Reply With Quote
Old 09-08-2019, 08:57 PM   #4
isladogs
Part time moderator
 
isladogs's Avatar
 
Join Date: Jan 2017
Location: Somerset, UK
Posts: 10,104
Thanks: 110
Thanked 2,746 Times in 2,505 Posts
isladogs is a splendid one to behold isladogs is a splendid one to behold isladogs is a splendid one to behold isladogs is a splendid one to behold isladogs is a splendid one to behold isladogs is a splendid one to behold
Re: Randomizing Data for Posted db's

I was thinking about scrambling numbers as well as letters in fields such as tutor group (11BN), part number (RT23-TH5Q-WA) or date of birth (08/11/1982) but any special characters to stay unchanged

Ideally any scrambled date values would remain valid dates .
__________________
If this answer has helped, please click the Thanks button and/or click the 'reputation scales' symbol on the left.

Website links:
To view links or images in signatures your post count must be 10 or greater. You currently have 0 posts.
,
To view links or images in signatures your post count must be 10 or greater. You currently have 0 posts.
,
To view links or images in signatures your post count must be 10 or greater. You currently have 0 posts.
,
To view links or images in signatures your post count must be 10 or greater. You currently have 0 posts.


Colin
Access 2010 32-bit, Access 2016 32-bit & 64-bit, SQL Server Express 2014, Windows 10,
To view links or images in signatures your post count must be 10 or greater. You currently have 0 posts.
,
To view links or images in signatures your post count must be 10 or greater. You currently have 0 posts.
isladogs is offline   Reply With Quote
Old 09-11-2019, 07:15 PM   #5
Micron
Newly Registered User
 
Join Date: Oct 2018
Location: Ontario, Canada
Posts: 872
Thanks: 10
Thanked 181 Times in 171 Posts
Micron has a spectacular aura about Micron has a spectacular aura about
Re: Randomizing Data for Posted db's

Sorry for the long reply, but it's a convoluted situation.

Have been playing with this and have done limited testing; not really sure how best to approach the suggestions with respect to dates but I made a decision.

First, I'd say that dates are a non-starter for me given the different formats at play. While one could probably tweak a date based on European or US format, I don't foresee much of a need. My supposition is that the type of data that most likely needs 'protection' is names and addresses, and to a lesser extent, phone numbers (will anyone worry that someone will get their hands on randomized names, dial up the real phone number and asks to speak to "Dybqw Nbplh")? I could say "don't use it on date fields because a date of 66/81/9054 will likely cause issues beyond the one(s) you're posting about" but someone will eventually do that so I have tried to cover that situation.

Changes to look for:
- If Null wasn't captured it balked

- if used on a date field the date had to be updated to what it already was, otherwise the field values became null or "" (I didn't check which). Updating to the same value seemed pointless and raised the confirmation prompt for no reason so I had to find a way to not pass anything back to a query in that case. Answer was to set it to Nothing, which raises error 91, which is handled.

- the fact that inputs could be anything means the variables had to be changed to variants

- former code would have altered Capital Letters and special characters. Should not do so now.

- some characters in the standard set and also extended characters beyond 122 are not altered.

I try not to code in a way that causes errors to be raised only to be ignored or suppressed, but in this case it seems necessary. One thing that just occurred to me is the possibility that this code might alter hidden characters such as those used between rtf tags. I hope that some of you will put this code to the test on different field data types as I am trying to catch up on a lot of other projects lately and don't have time to do a ton of testing.

Code:
Function RandomizeData(varField As Variant, i As Integer) As Variant
Dim var1 As Variant, var2 As Variant

On Error GoTo errHandler

'i= count of characters from beginning of string that are to be retained.
'Pass to Left function and assign to str1
 var1 = Left(varField, i)
 
 If IsDate(varField) Or IsNull(varField) Then
    RandomizeData = Nothing
    Exit Function
End If

'make loop counter start at position number of NEXT character which is to be replaced.
'End value is string length.
For i = i + 1 To Len(varField)

'make str2 equal current value + a random lower case character from a to z
'and repeat to end of counter.

  Select Case Asc(Mid(varField, i, 1))
    Case 0 To 47, 58 To 64, Is > 122 'don't alter these characters
      var2 = var2 + Mid(varField, i, 1)
      
    Case 48 To 57
      var2 = var2 & Chr((56 - 48 + 1) * Rnd + 48)
      
    Case 65 To 90
        var2 = var2 & Chr((89 - 65 + 1) * Rnd + 65)
        
    Case 97 To 121
      var2 = var2 & Chr((120 - 97 + 1) * Rnd + 97)
      
  End Select
Next

'piece left and right parts together and return to query
RandomizeData = var1 + var2
'e.g if input value is 'Donald Duck', output string with 1st 2 characters retained might be Dosnoh itat

exitHere:
Exit Function

errHandler:
If Err.Number = 91 Then
    Exit Function
Else
    MsgBox "Error " & Err.Number & ": " & Err.Description
End If

End Function
__________________
Sometimes I just roll my eyes out loud...
Windows 10; Office 365 (Access 2016)
Micron is offline   Reply With Quote
The Following User Says Thank You to Micron For This Useful Post:
isladogs (09-11-2019)
Old 09-11-2019, 10:46 PM   #6
isladogs
Part time moderator
 
isladogs's Avatar
 
Join Date: Jan 2017
Location: Somerset, UK
Posts: 10,104
Thanks: 110
Thanked 2,746 Times in 2,505 Posts
isladogs is a splendid one to behold isladogs is a splendid one to behold isladogs is a splendid one to behold isladogs is a splendid one to behold isladogs is a splendid one to behold isladogs is a splendid one to behold
Re: Randomizing Data for Posted db's

Quote:
Hello is that Dibqu Nibpool … apologies if I've mispronounced your name
Thanks for doing this.
I realised it would add complications but I've done some limited testing as follows
1. Text including retaining capital letters - perfect. I'm glad you made that change
2. Number or number/text fields such as 11BN - also perfect
3. Date fields are easy enough to handle
Instead of using RandomizeData(DOB,0), I used
Code:
CDate(randomizedata(CLng([DateOfBirth]),0))
This also means no need to worry about US or UK date formats
That could probably be incorporated into the code where you test for IsDate but I just did it in the query as it was a quick test.

No time to test rich text fields. Suggest possibly apply the Plaintext function in such cases before randomising?
__________________
If this answer has helped, please click the Thanks button and/or click the 'reputation scales' symbol on the left.

Website links:
To view links or images in signatures your post count must be 10 or greater. You currently have 0 posts.
,
To view links or images in signatures your post count must be 10 or greater. You currently have 0 posts.
,
To view links or images in signatures your post count must be 10 or greater. You currently have 0 posts.
,
To view links or images in signatures your post count must be 10 or greater. You currently have 0 posts.


Colin
Access 2010 32-bit, Access 2016 32-bit & 64-bit, SQL Server Express 2014, Windows 10,
To view links or images in signatures your post count must be 10 or greater. You currently have 0 posts.
,
To view links or images in signatures your post count must be 10 or greater. You currently have 0 posts.
isladogs is offline   Reply With Quote
Old 09-12-2019, 07:02 AM   #7
Micron
Newly Registered User
 
Join Date: Oct 2018
Location: Ontario, Canada
Posts: 872
Thanks: 10
Thanked 181 Times in 171 Posts
Micron has a spectacular aura about Micron has a spectacular aura about
Re: Randomizing Data for Posted db's

As written, it will alter rtf tags; e.g. <div> could become <arw> which will mess up the field value. As for dates, I tested with your expression and had a type conversion error for one record. I'm still not convinced that altering dates so dramatically will be important for those who have data they want to randomize, and it may mess up code or sql that uses operators on them. I suppose anyone could use your expression easily enough should they decide to randomize dates.

As for rtf, not sure when I'll have time to do anything but will try to get back to it. Even though it's raining, I still have stuff I'm supposed to be doing.

Micron is offline   Reply With Quote
Old 09-16-2019, 01:51 AM   #8
moke123
Me.Dirty=True
 
moke123's Avatar
 
Join Date: Jan 2013
Location: Massachusetts
Posts: 737
Thanks: 2
Thanked 226 Times in 212 Posts
moke123 will become famous soon enough
Re: Randomizing Data for Posted db's

I was away for a few days but this thread was on my mind. I was thinking that it might be difficult for some posters to execute code from a module.
I threw together a form that could be imported into a COPY of the database. You can then select the tables and fields to randomize. Currently it works on text, number, and date fields. The primary key field is excluded from the list as is boolean fields, ole, attachments, calculated.etc. Foreign keys are listed and caution should be used not to alter them.
If you update your randomization code, it shouldn't be too difficult to incorporate it into the form.
Attached Files
File Type: accdb randomizer.accdb (684.0 KB, 1 views)

moke123 is offline   Reply With Quote
Reply

Thread Tools
Display Modes Rate This Thread
Rate This Thread:

Forum Jump

Similar Threads
Thread Thread Starter Forum Replies Last Post
Never posted before Alfisdad Introduce Yourself 1 12-18-2014 07:59 AM
Randomizing Query for Multiple fields GlennOwns Queries 2 03-30-2012 05:55 PM
DlookUp problem posted with data hmho Forms 3 06-24-2009 02:11 PM
Randomizing table records l0is Tables 3 02-09-2008 11:09 AM
Randomizing a Resultset 2 ajetrumpet Modules & VBA 32 01-23-2008 05:28 PM




All times are GMT -8. The time now is 07:42 PM.


Microsoft Access Help
General
Tables
Queries
Forms
Reports
Macros
Modules & VBA
Theory & Practice
Access FAQs
Code Repository
Sample Databases
Video Tutorials

Featured Forum post


Sponsored Links


Powered by vBulletin®
Copyright ©2000 - 2019, Jelsoft Enterprises Ltd.
(c) copyright 2017 Access World