Removed keys count statistical information / compaction trouble.

Mar 20, 2013 at 1:01 AM
Is there any way to get statistics on how many keys in a table are inactive
(removed)? This is very useful information: it can be used to pick the
threshold at which a table should be compacted.
Mar 20, 2013 at 1:16 AM
I also have a big problem with compaction in a multi-threaded application. Your documentation describes a way
to compact a table, and you wrote that the best way is to do it right after the engine initializes, from a single thread.
But what if the application is a long-running, multi-threaded one? How do I solve this problem then?
RenameTable operations are not part of a transaction, and outside of a transaction the table can change before the rename
occurs, so data can be lost.
Basically, what I want is periodic compaction while the application is running. I cannot afford to restart it or to compact only at
initialization. Can you suggest a good way to do that?
Coordinator
Mar 20, 2013 at 9:17 AM
  1. There is no way to see how many keys were removed. You can count them yourself and store the count in another table.
  2. I don't recommend compacting data at all. Today it is either unnecessary or should be done very rarely. If you really need to compact data, stop the engine, compact everything, and start the engine again.
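The suggestion in item 1 (count removals yourself) could be sketched roughly as follows. This is only an illustration under stated assumptions: `RemovalTrackingStore`, `RemovedCount` and `ShouldCompact` are hypothetical names, plain dictionaries stand in for the real tables, and no DBreeze API is used. In a real application the counter would live in a separate table and be updated in the same transaction as the removal.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper: counts successful removals per table so the
// application can decide when compaction might be worthwhile.
public class RemovalTrackingStore
{
    // In-memory stand-ins for the data tables and for the "statistics" table.
    private readonly Dictionary<string, Dictionary<string, string>> _tables =
        new Dictionary<string, Dictionary<string, string>>();
    private readonly Dictionary<string, long> _removedCounts =
        new Dictionary<string, long>();
    private readonly object _gate = new object();

    public void Insert(string table, string key, string value)
    {
        lock (_gate)
        {
            if (!_tables.TryGetValue(table, out var rows))
            {
                rows = new Dictionary<string, string>();
                _tables[table] = rows;
            }
            rows[key] = value;
        }
    }

    // Returns true only if the key existed; only then is the counter bumped.
    public bool Remove(string table, string key)
    {
        lock (_gate)
        {
            if (_tables.TryGetValue(table, out var rows) && rows.Remove(key))
            {
                _removedCounts.TryGetValue(table, out long n); // n is 0 if absent
                _removedCounts[table] = n + 1;
                return true;
            }
            return false;
        }
    }

    public long RemovedCount(string table)
    {
        lock (_gate)
        {
            _removedCounts.TryGetValue(table, out long n);
            return n;
        }
    }

    // The threshold is chosen by the application.
    public bool ShouldCompact(string table, long threshold) =>
        RemovedCount(table) >= threshold;
}
```

The counter would be reset to zero after each compaction of the corresponding table.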
Coordinator
Mar 20, 2013 at 9:33 AM
If you want to compact data on a hot DB, you have to be prepared for some reading threads to get read exceptions during the short period while the table is being renamed. Writing threads are not affected, because the rename can be done inside a transaction in which you first put the name of the table being renamed into the synchro list.
Mar 25, 2013 at 12:42 PM
Edited Mar 25, 2013 at 12:43 PM
Blaze, one question/request. Is it possible to change the code so that Transaction.SynchronizeTables can be called multiple times before tables are accessed (read or written)? I know what you wrote about deadlock handling, and it is logical to do it in a single statement, but sometimes that is very impractical DAL-wise.
For example, what would be different if you did:
using (Transaction transaction = engine.GetTransaction ())
{
 transaction.SynchronizeTables ("table1");
 <some non-transaction code>
 <some non-transaction code>
 transaction.SynchronizeTables ("table2");
 <some non-transaction code>
 <some non-transaction code>
 transaction.Insert (); <- start table locking when this is invoked.
 transaction.Remove();
 transaction.Insert ();
 transaction.Select ();
}
What would be the problem with locking the tables only just before transaction.Insert or any other similar querying/altering
statement? Do you know what I mean?
Coordinator
Mar 25, 2013 at 1:04 PM
Deadlocks are possible in your example.

So, the answer is no - not possible.
Coordinator
Mar 25, 2013 at 1:10 PM
For example, if another thread runs something like this in parallel:

using (Transaction transaction = engine.GetTransaction ())
{
transaction.SynchronizeTables ("table2");
<some non-transaction code>
<some non-transaction code>
transaction.SynchronizeTables ("table1");
<some non-transaction code>
<some non-transaction code>
....
Coordinator
Mar 25, 2013 at 1:13 PM
OK, I got your idea with transaction.Insert (); <- start table locking when this is invoked....
:)
but it's the same as

List<string> tablesToBeSynchronized=new List<string>();
tablesToBeSynchronized.Add("table1");
<some non-transaction code>
<some non-transaction code>
tablesToBeSynchronized.Add("table2");
<some non-transaction code>
<some non-transaction code>

transaction.SynchronizeTables (tablesToBeSynchronized);
...
transaction.Insert ();
transaction.Remove();
transaction.Insert ();
transaction.Select ();
Coordinator
Mar 25, 2013 at 1:19 PM
Edited Mar 25, 2013 at 1:20 PM
Actually, this is the common and recommended pattern: you open a transaction, collect the tables that must be synchronized (using Select statements where needed), call SynchronizeTables once, and then execute the table changes.
To avoid deadlocks you must have a single synchro-point, and that is what a single SynchronizeTables call per transaction achieves.
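One way to see why a single synchro-point helps: if a transaction requests all of its tables in one call, the locks can always be taken in the same (sorted) order, so two threads asking for table1 and table2 in opposite order cannot end up waiting on each other. The sketch below only illustrates that general idea. `TableLocks` and `Synchronize` are hypothetical names, and nothing here claims to be how the library implements SynchronizeTables.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

// Hypothetical illustration of a single synchro-point with ordered locking.
public static class TableLocks
{
    private static readonly Dictionary<string, object> Locks =
        new Dictionary<string, object>();

    // One lock object per table name, created on first use.
    private static object LockFor(string table)
    {
        lock (Locks)
        {
            if (!Locks.TryGetValue(table, out var l))
            {
                l = new object();
                Locks[table] = l;
            }
            return l;
        }
    }

    // Acquire every requested table lock in ONE call, always in sorted order.
    // The caller's ordering of the names does not matter.
    public static IDisposable Synchronize(IEnumerable<string> tables)
    {
        List<object> ordered = tables
            .Distinct()
            .OrderBy(t => t, StringComparer.Ordinal)
            .Select(t => LockFor(t))
            .ToList();

        foreach (object l in ordered)
            Monitor.Enter(l);

        return new Releaser(ordered);
    }

    // Releases the locks in reverse order; must be disposed on the
    // same thread that called Synchronize.
    private sealed class Releaser : IDisposable
    {
        private readonly List<object> _held;
        public Releaser(List<object> held) { _held = held; }

        public void Dispose()
        {
            for (int i = _held.Count - 1; i >= 0; i--)
                Monitor.Exit(_held[i]);
        }
    }
}
```

With this sketch, one thread calling `TableLocks.Synchronize(new[] { "table1", "table2" })` and another calling it with `{ "table2", "table1" }` both take the table1 lock first. If each thread instead locked its tables one by one in its own order, the first could hold table1 while waiting for table2, and the second could hold table2 while waiting for table1.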
Mar 25, 2013 at 8:53 PM
True, and I agree with you, but sometimes this approach produces a lot of unnecessary, repetitive code. I would actually consider this
a very usable feature. The only thing that would need to change, IMO, is Transaction.cs: wrap the
table alteration/query methods and do the synchronization there, as you described above.
Consider this very trivial example:
void Method1 (Transaction transaction)
{
 <non transaction code>
 transaction.SynchronizeTables ("table1");  // Because some non-transactional code decided this table should be locked.
 <non transaction code>
 <non transaction code>
 <non transaction code>
}

void Method2 (Transaction transaction)
{
 <non transaction code>
 <non transaction code>
 transaction.SynchronizeTables ("table2"); // Because some non-transactional code decided this table should be locked.
 <non transaction code>
 <non transaction code>
 <non transaction code>
}

using (Transaction transaction = engine.GetTransaction ())
{
 <some non-transaction code>
 <some non-transaction code>
 Method1 (transaction);
 <some non-transaction code>
 Method2 (transaction);
 <some non-transaction code>
 <some non-transaction code>
 transaction.Insert ();
 transaction.Remove();
 transaction.Insert ();
 transaction.Select ();
}
I will survive without this, but if you ever decide to include it, it would be a neat and useful feature.
Coordinator
Mar 25, 2013 at 10:31 PM
I didn't get the idea of your example.

SynchronizeTables immediately marks the table(s) as synchronized tables and is allowed only once per transaction, to avoid deadlocks.

Probably, in your example, you use SynchronizeTables as a collector of tables to be synchronized, and you want the first modification command, like Insert or Delete,
to trigger the real synchronization?

If so, there are some reasons why I wouldn't do that.
First, every modification command would need an extra check (for the conditions above), which makes the program slower overall.
Second, the SynchronizeTables command gives you a "feeling" for when the real transaction starts.
Third, there is not really much to code here:
void Method1 (Transaction transaction, List<string> tbls)
{
 <non transaction code>
  tbls.Add ("table1");  // Because some non transactional code decided this table should be locked.
 <non transaction code>
 <non transaction code>
 <non transaction code>
}

void Method2 (Transaction transaction, List<string> tbls)
{
 <non transaction code>
 <non transaction code>
 tbls.Add("table2"); // Because some non transactional code decided this table should be locked.
 <non transaction code>
 <non transaction code>
 <non transaction code>
}

using (Transaction transaction = engine.GetTransaction ())
{
 List<string> tbls = new List<string>();

 <some non-transaction code>
 <some non-transaction code>
 Method1 (transaction, tbls);
 <some non-transaction code>
 Method2 (transaction, tbls);
 <some non-transaction code>
 <some non-transaction code>

 transaction.SynchronizeTables (tbls);

 transaction.Insert ();
 transaction.Remove();
 transaction.Insert ();
 transaction.Select ();
}