Invisible Data Problem

(Ref Id: 1437921307)

Continuing our work on permissions, let's turn our attention to a simple problem that might be occurring in your system: the so-called Invisible Data Problem.

Let's create a few files to use:

cd /tmp
mkdir invis
cd ./invis
echo 100 > afile.txt
echo 200 > afile2.txt
echo 300 > afile3.txt
cat afile*

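Our output now is the three values we just wrote (100, 200 and 300), one per line; the order depends on how your shell sorts the file names.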

Let's change the ownership of a few of these files to someone else and remove the 'everyone' read privilege:

sudo chown postgres afile2.txt afile3.txt
sudo chgrp postgres afile2.txt afile3.txt
sudo chmod o-r afile2.txt afile3.txt
ls -l

Now we have:

-rw-r----- 1 postgres postgres 4 Jul 12 12:05 afile2.txt
-rw-r----- 1 postgres postgres 4 Jul 12 12:05 afile3.txt
-rw-r--r-- 1 limsexpert limsexpert 4 Jul 12 12:05 afile.txt

Let's try to see the contents:

cat afile*

Correctly, this outputs the one value we can still read, along with two errors:

cat: afile2.txt: Permission denied
cat: afile3.txt: Permission denied
100

The pattern matches three files, but two of them are inaccessible to the current user, so the system tells you as much.

Thus far everything is okay. We know that there are inaccessible files here and our attempt to get all of the values is resulting in an error. But what if the system were coded to ignore files that we do not have access to?

cat *.txt 2>/dev/null

Above we ignored the error output (that is the meaning of 2>/dev/null: it takes the error output and throws it away). Now we incorrectly think that we only have one number: 100.

This problem occurs all too often in systems that have group settings on data but no way of telling the process or user consuming that data that it lacks the privileges to access all of it. This is the 'Invisible Data Problem', and there really is no single solution.
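
On the command line, at least, throwing away the error messages does not remove every trace of the failure: cat still returns a non-zero exit status when it could not read one of its arguments. The following is a minimal sketch of a wrapper that hides the individual messages but still reports that the output is incomplete. The name cat_checked and the wording of the warning are assumptions made for illustration; they are not part of any standard tool.

```shell
# cat_checked FILE...
# Hypothetical wrapper (the name is an assumption): prints whatever can be
# read, discards cat's own error messages, but warns on stderr and returns
# cat's non-zero status if any file could not be read.
cat_checked() {
    status=0
    cat -- "$@" 2>/dev/null || status=$?
    if [ "$status" -ne 0 ]; then
        echo "warning: some files could not be read; output is incomplete" >&2
    fi
    return "$status"
}
```

Run in the example directory as cat_checked *.txt, this should still print 100, but the warning and the non-zero return value tell the caller that the value is not the whole story.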

Tighter Queries

You might be asking, 'what if we write tighter queries for the data?' If we know what groups our user belongs to we could write our query to specifically limit what is retrieved.

We can emulate this by using the find utility:

find . -type f -group limsexpert -exec cat {} +

Actually, this makes the problem worse: the other files contain pertinent data, but we still cannot see any of it, and now nothing even hints that they exist. All we have done is mask the problem by making our query more specific. Either the records are set up incorrectly for the operation, or this particular operation should bypass the group settings entirely:

sudo cat *.txt

Here 'sudo' allows one to act as the superuser. Since the superuser can 'see everything' the query works as expected.
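
Short of running everything as the superuser, we can at least measure how much a group-restricted query leaves out. The sketch below counts the regular files under a directory that do not belong to a given group, which is exactly the set the tighter find command skips silently. The function name count_outside_group is an assumption made for illustration.

```shell
# count_outside_group DIR GROUP
# Hypothetical helper (the name is an assumption): prints the number of
# regular files under DIR whose group is not GROUP, i.e. the files that a
# "find DIR -group GROUP" query would silently leave out.
count_outside_group() {
    dir=$1
    group=$2
    # The arithmetic expansion strips any padding that wc adds to its output.
    echo $(( $(find "$dir" -type f ! -group "$group" | wc -l) ))
}
```

In the example directory, count_outside_group . limsexpert should print 2, one for each file now owned by the postgres group. A non-zero count is a hint that the group-limited result is not the whole data set.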

LIMS Application/Summary

So in a LIMS we want permissions on records, but there are instances where the underlying data is not set up as one would expect. In these cases it would be helpful if there were an equivalent to the operating system's error channel that we could use. Without one we are left to perform tests similar to those above: purposefully adding records to sets with the wrong group settings to see whether they affect the output in a negative way.
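
Such a test can be sketched with the same files we used above. The idea is to state how many values a query ought to return and then compare that with what actually comes back. The helper name check_complete and its messages are assumptions made for illustration, and the command passed to it stands in for whatever query your LIMS runs.

```shell
# check_complete EXPECTED COMMAND [ARG...]
# Hypothetical test helper (the name is an assumption): runs COMMAND with its
# error output discarded, counts the lines it prints, and fails if that count
# differs from EXPECTED.
check_complete() {
    expected=$1
    shift
    actual=$(( $("$@" 2>/dev/null | wc -l) ))
    if [ "$actual" -ne "$expected" ]; then
        echo "incomplete: expected $expected values, got $actual" >&2
        return 1
    fi
    echo "complete: got all $expected values"
}
```

With the permissions set as above, check_complete 3 cat afile.txt afile2.txt afile3.txt should report an incomplete result, because only one of the three values is readable. The same call under an account that can read all three files should pass.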

Armed with this information about how our LIMS handles the Invisible Data Problem, we can either change our processes (making the data 'correct' at the right times) or rewrite our queries to be all-inclusive and thus avoid the group problem entirely.


Citation: Invisible Data Problem. (2015). Retrieved Wed Mar 22 22:07:09 2017, from;iid=readMore;go=1437921307