Sequential updates to the same data - data loss question

Do you have a question? Post it now! No Registration Necessary.  Now with pictures!

Threaded View

I'm posting here for lack of a better place.  Definitely not related
to either
database or flat files but this is definitely related to web
programming and
I'm presently developing in PHP.

I'm looking for some additional strategies to deal with lost data due
to sequential updates.

A simplified example would be a phonebook application available to be
by many users:
  - User A opens a record
  - User B opens the same record
  - User A and B change the phone number in the record to different
numbers, respectively.
  - User A submits the change
  - User B submits the change
In this example, user A's changes were overwritten and user A does not
about it.  Additionally, user B did not know that user A was also
Ideally they both changed the phone number to the same value and no
one is the
wiser, but its not a good thing to have happen.

As explained in other posts, an edit lock on the record/file would not
help anyone as it would remain locked if the user lost connection or
and the javascript ping or cleanup script just seems too much.  I'd
prefer a
more dynamic solution.

The strategy I've used so far is to store the original data in the
form as well
as the data being changed.  Upon submission, if the data does not
then user "B" would be given an error and have to go back and re-load
"A"'s changes then re-apply their own.  Not a perfect solution but
resolves the
overwrite, even if it is at the expense of user "B" having to re-enter

I have noticed that most Wiki software does this but goes the extra
step in
attempting some either automatic or semi-automatic reconciliation
between what
the user entered and what was updated since the user started.

In my application if the data being changed is in separate sections of
record then it just happens automatically (e.g. one user changing the
while the other changing the phone number) so this would only apply to
the same

Are there other strategies that can work here?  I've addressed edit
locking and
original data comparision.

I guess what I'm trying to find out, is there anything written on this
or does anyone have some thoughts on the matter?


Re: Sequential updates to the same data - data loss question

On Mon, 7 Dec 2009 11:36:08 -0800 (PST), Mycroft wrote:
Quoted text here. Click to load it

Pretty much a FAQ at this point...

Quoted text here. Click to load it

And pretty much the de facto solution.

Quoted text here. Click to load it

Yup. But its doing so based on the idea that most of those concurrent
updates are working on DIFFERNT PARTS of the record. Essentially as
though User A updates the phone number, and user B updates the email

As with ...

Quoted text here. Click to load it

TBH, these steps are enough so often that there's little reason to do
more. In almost every case, the number of records in a database will
scale much faster than the rate of updates.

3. My noble half-brother whose throne I usurped will be killed, not kept
   anonymously imprisoned in a forgotten cell of my dungeon.
        --Peter Anspach's list of things to do as an Evil Overlord

Re: Sequential updates to the same data - data loss question

Quoted text here. Click to load it

This is certainly the only reliable method. I have seen code where the
developer flagged the record as reserved for a particular session -
but that then gets complicated to clean up when the session never
sends back an update.

But you don't need to store all of the data between the fetch and the
update - you could just store a hash - but this still requires an
additional fetch operation before carrying out the update to confirm
the row hasn't changed. A better solution is to keep a timestamp or a
version number on the record, then....

$qry=3D"UPDATE record SET value=3D$new_value WHERE id=3D$primary_key AND
if (!mysql_affected_rows(mysql_query($qry)) {
  print "try again - someone else has modified the record";

Quoted text here. Click to load it

While it would be possible to try to automatically reconcile the
record, this pre-supooses that there are no dependencies betwen the
columns in the business logic.

Quoted text here. Click to load it

Yes - you can use an advisory locking system like the one I described
previously, but it's really not worth the effort - if access conflicts
are likely to occur then you'll just end up employing someone to
remove the stale locks.


Re: Sequential updates to the same data - data loss question

Mycroft wrote:
Quoted text here. Click to load it
Quoted text here. Click to load it

This is pretty much a classic problem, so go grab a book on concurrent

There will be a variety of solutions, but almost all of them involve
identifying a critical section and implementing some kind of locking

Iván Sánchez Ortega -ivan-sanchezortega-es-

You never know how many friends you have until you rent a house on the

Re: Sequential updates to the same data - data loss question

Quoted text here. Click to load it

Yes, but it's very important in a web setting (or anywhere else,
but it's especially a problem on the web because of the way HTTP
works) that you do NOT hold a lock while waiting for user interaction,
since a user may leave for lunch, go on a 2-week vacation, quit,
die, or otherwise fail to finish the edit.  A strict locking mechanism
might hold up payroll until the edit is finished, something which
will not be popular with other employees.  Given the way PHP pages
are run, it's difficult to violate this anyway, as PHP pages aren't
running while waiting for the user to submit a form.

It's still important to lock the record while updating it (although
it may not be necessary to use an explicit lock if you can make
the whole change with one SQL UPDATE query).  If the data is kept
in a file rather than a database, consider use of file locking.

Original data comparison in the strict form rejects the edit if any
of the original data has changed.  This also happens to reject
submitting the same form twice, which is probably desirable.

A slightly relaxed form lets you divide the data into "sections",
where changes to data in different sections are presumed independent.

For some fields, where the new value of the field is NOT normally
computed from the old value (applies to things like phone number,
address, etc. but NOT salary, or any year-to-date total of something
field, etc.), if the value in the record is what the change wanted
to change it to, you can consider the change already made and not
report an error.

Two people changing the same field to the same thing is NOT necessarily
problem-free if you ignore one change.  If employee C gets a 5%
individual raise and everyone (including C) gets an across-the-board
5% raise, both changes hitting C's record at the same time cheat C
out of part of his raise.

Site Timeline