issues and suggestions

Jan 16, 2013 at 12:52 AM
Edited Jan 16, 2013 at 12:59 AM

1. Please provide a simple sample showing how to implement our own DbCustomSerializer, since the built-in DbMJSON is fairly slow: for a simple object with 5 fields of different simple data types, 1 million records take 20 seconds, whereas for a simple string value it finishes within 2 seconds.

2. Could DbCustomSerializer return byte[] rather than string? In normal usage, a developer serializes a POCO and gets it back, and should not really have to care how the storage works, so I would like something like this:

  

using System;
using System.Diagnostics;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using DBreeze;
using DBreeze.DataTypes;

namespace ConsoleApplication18
{
    class Program
    {
        static void Main(string[] args)
        {
            var w = new Stopwatch();
            w.Start();
            using (var engine = new DBreezeEngine(@"C:\Temp\bb"))
            {
                using (var tran = engine.GetTransaction())
                {
                    // MySerializer and MyDeserializer are the poster's own methods (not shown in this post).
                    DBreeze.Utils.CustomSerializator.Serializator = MySerializer;
                    DBreeze.Utils.CustomSerializator.Deserializator = MyDeserializer;

                    try
                    {
                        for (int i = 0; i < 1000000; i++)
                        {
                            tran.Insert<int, DbCustomSerializer<Foo>>("t2", i, new Foo { Double = i * 1.1, Long = (ulong)i, Name = "Name " + i, Time = DateTime.Now, Value = i });
                        }
                        tran.Commit();
                    }
                    catch (Exception ex)
                    {
                        Console.WriteLine(ex.ToString());
                    }

                    var data = tran.SelectForward<int, DbCustomSerializer<Foo>>("t2").Take(100);
                    foreach (var item in data)
                    {
                        Console.WriteLine(item.Key + "," + item.Value); //value should be the actual Foo object
                    }
                }
            }
            w.Stop();
            Console.WriteLine(w.Elapsed);
            Console.WriteLine("done");
            Console.Read();

        }
    }

    class Foo
    {
        public string Name { get; set; }
        public int Value { get; set; }
        public DateTime Time { get; set; }
        public ulong Long { get; set; }
        public double Double { get; set; }
    }
}


3. Could the Serializator/Deserializator support generics?

4. Does it come with compression?

Coordinator
Jan 16, 2013 at 7:49 AM
Edited Jan 16, 2013 at 7:51 AM

Look at

http://fastjson.codeplex.com/

or

http://code.google.com/p/protobuf-net/

 

They serialize; you store the value in DBreeze as a serialized byte[].

 

Actually, DBreeze doesn't need anything more than the ability to store a value as byte[]; the rest is a higher level. The programmer can serialize and compress by himself, using his favorite technology.
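To illustrate what "serialize and compress by himself" could look like, here is a minimal sketch that turns the Foo class from the first post into a byte[] and back using only the .NET base class library. The FooBytes class and its method names are hypothetical and not part of DBreeze; the field order and the optional GZip step are assumptions made for the example.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Same shape as the Foo class from the first post.
public class Foo
{
    public string Name { get; set; }
    public int Value { get; set; }
    public DateTime Time { get; set; }
    public ulong Long { get; set; }
    public double Double { get; set; }
}

// Hypothetical helper for illustration; not part of DBreeze.
public static class FooBytes
{
    // Writes the fields in a fixed order; Deserialize must read them in the same order.
    public static byte[] Serialize(Foo foo)
    {
        using (var ms = new MemoryStream())
        {
            using (var w = new BinaryWriter(ms, Encoding.UTF8))
            {
                w.Write(foo.Name ?? string.Empty); // a null Name is stored as an empty string
                w.Write(foo.Value);
                w.Write(foo.Time.ToBinary());      // keeps ticks and DateTimeKind
                w.Write(foo.Long);
                w.Write(foo.Double);
            }
            return ms.ToArray();                   // ToArray still works after the stream is closed
        }
    }

    public static Foo Deserialize(byte[] data)
    {
        using (var r = new BinaryReader(new MemoryStream(data), Encoding.UTF8))
        {
            // Object initializers run in textual order, matching the write order above.
            return new Foo
            {
                Name = r.ReadString(),
                Value = r.ReadInt32(),
                Time = DateTime.FromBinary(r.ReadInt64()),
                Long = r.ReadUInt64(),
                Double = r.ReadDouble()
            };
        }
    }

    // Optional GZip step on top of the raw bytes.
    public static byte[] Compress(byte[] raw)
    {
        using (var ms = new MemoryStream())
        {
            using (var gz = new GZipStream(ms, CompressionMode.Compress))
            {
                gz.Write(raw, 0, raw.Length);
            }
            return ms.ToArray();
        }
    }

    public static byte[] Decompress(byte[] packed)
    {
        using (var gz = new GZipStream(new MemoryStream(packed), CompressionMode.Decompress))
        using (var ms = new MemoryStream())
        {
            gz.CopyTo(ms);
            return ms.ToArray();
        }
    }
}
```

The resulting byte[] is what would be stored as the table value. For records as small as Foo, GZip may well make the value larger rather than smaller, so the compression step is worth measuring before adopting it.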

Jan 16, 2013 at 8:02 AM
Edited Jan 16, 2013 at 8:02 AM

Imagine a system with hundreds of complex POCOs. It's not possible to write a separate serializer for each POCO, so there has to be some sort of easy-to-use serializer, and the process should be transparent to the developer. IMHO, I don't really like to mess with byte[]. For example:

1. To serialize:

 

tran.Insert<int, Foo>("t2", i, new Foo { Double = i * 1.1, Long = (ulong)i, Name = "Name " + i, Time = DateTime.Now, Value = i });

 

2. To deserialize:

 

var data = tran.SelectForward<int, Foo>("t2").Take(100);
foreach (var item in data)
{
    Console.WriteLine(item.Key + "," + item.Value.Time); //value should be the actual Foo object
}

 

However, the developer should still have the ability to customize the serializer.

Coordinator
Jan 16, 2013 at 8:11 AM

Check the source code file DBreeze.DataTypes.DbCustomSerializer.cs; maybe it will help?

Jan 16, 2013 at 11:25 AM

In my first post I already use DbCustomSerializer, but there are performance and usability concerns. What I am saying is that it would be better to have a fast built-in serializer that converts POCOs to and from storage transparently.

Coordinator
Jan 16, 2013 at 11:48 AM

Which serializer did you use here?

DBreeze.Utils.CustomSerializator.Serializator = MySerializer;

Did you measure the pure serialization time of 1 million Foo objects, without inserting into the DB?

Also add the time of converting the serialized string into byte[] using UTF-8 encoding. Maybe it's OK?
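A rough way to take that measurement is sketched below: it times only the string serialization plus the UTF-8 conversion, with no database involved. The SerializationTiming class is hypothetical, and the serializer is passed in as a delegate because the serializer used in the first post is not shown. Note that the time to create each object is included in the result.

```csharp
using System;
using System.Diagnostics;
using System.Text;

// Hypothetical benchmark helper for illustration; not part of DBreeze.
public static class SerializationTiming
{
    // Creates `count` objects, serializes each to a string, converts the string
    // to UTF-8 bytes, and returns the elapsed time. totalBytes receives the
    // summed size of all byte arrays.
    public static TimeSpan Measure<T>(
        Func<int, T> create,
        Func<T, string> serialize,
        int count,
        out long totalBytes)
    {
        totalBytes = 0;
        var w = Stopwatch.StartNew();
        for (int i = 0; i < count; i++)
        {
            string text = serialize(create(i));
            byte[] bytes = Encoding.UTF8.GetBytes(text);
            totalBytes += bytes.Length;
        }
        w.Stop();
        return w.Elapsed;
    }
}
```

A call such as SerializationTiming.Measure(i => "Name " + i, s => s, 1000000, out bytes) gives a baseline for plain strings; swapping in a Foo factory and the real serializer shows how much of the 20 seconds is spent outside the database.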

 

We don't use JSON or XML at all in our business for huge numbers of records, only an optimal byte[]. We like it much more: faster read/write, much less space.

If we need to store some thousands of objects in a table, we can use integrated Microsoft JSON - not a big problem.

Built-in serializers, compressors, etc. are the next abstraction level. You can try to create them yourself, but first, please study the documentation carefully, run many tests, and get a feel for the database.

Jan 16, 2013 at 8:02 PM

I am still reading the documents and I haven't done any actual perf comparison.

I will never use JSON or XML in our product because of the high data volume; performance is always my top priority when looking for a solution. Usability is my second concern, which I will explain in detail in my test.

I wrote some serializers before and I will write one specifically for this test.

I will come back to you after work.