tsichevski

StackOverflowError on relatively large transactions


Hi,

I'm getting a StackOverflowError at the commit phase when I try to load my database in one transaction. The database is of reasonable size: it contains ~350,000 objects and the file is ~70 MB. The database content consists of several interconnected trees.

The stack trace looks like this:

 	java.lang.StackOverflowError
	at org.garret.perst.impl.StorageImpl.getOid(StorageImpl.java:5397)
	at org.garret.perst.impl.StorageImpl.storeObject0(StorageImpl.java:3422)
	at org.garret.perst.impl.StorageImpl.swizzle(StorageImpl.java:3507)
	at org.garret.perst.impl.StorageImpl.swizzle(StorageImpl.java:4563)
	at org.garret.perst.impl.StorageImpl.packObject(StorageImpl.java:4860)
	at org.garret.perst.impl.StorageImpl.packObject(StorageImpl.java:4551)
	at org.garret.perst.impl.StorageImpl.storeObject0(StorageImpl.java:3434)
	at org.garret.perst.impl.StorageImpl.swizzle(StorageImpl.java:3507)
	at org.garret.perst.impl.StorageImpl.swizzle(StorageImpl.java:4563)
	...

If I increase the thread stack size from the default 4 MB to 8 MB with the Java -Xss option, the problem disappears. But that is not a good solution, since it increases the stack size for all threads, which is a waste of memory.

Is there a way to limit the recursion depth in Perst? Or control the traversal order?

Regards,

Vladimir


The Perst serializer performs a recursive traversal of all reachable objects. The order of traversal depends on the order of fields in the class and cannot be altered. It is possible to explicitly cut off recursion in Perst by overriding the Persistent.recursiveLoading() method; in that case, the components of such a class have to be loaded explicitly using the Persistent.load() method.
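For illustration, a minimal sketch of such a cut-off (the Node class and its field are made up, not taken from this thread; recursiveLoading() and load() are the standard Perst methods):

import org.garret.perst.Persistent;

// Hypothetical class that cuts off recursive loading: Perst will not
// traverse its references automatically, so a referenced object must
// be loaded explicitly before it is used.
public class Node extends Persistent {
  private Node child;

  public boolean recursiveLoading() {
    return false;              // do not load referenced objects automatically
  }

  public Node getChild() {
    if (child != null) {
      child.load();            // explicit load of the lazily referenced component
    }
    return child;
  }
}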

The general solution to the problem is to avoid using "hand-made" collection classes (or collection classes from the standard Java class library) and maintaining relations between objects through direct references. Perst collections should be used instead. The intended data model has one or more top-level Perst collections for locating object roots by key, plus relatively small "clusters" of interconnected objects (connected by ordinary Java references). Such a cluster is loaded automatically by Perst when one of its objects is accessed. If the clusters are not very large (< 100 objects), stack overflow during serialization should never happen.
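A rough sketch of that layout (the Storage calls are standard Perst API; the Item class, file name and page pool size are made up for illustration):

import org.garret.perst.*;

// A top-level Perst index serves as the database root; each indexed object
// is the root of a small cluster of plain Java references.
Storage db = StorageFactory.getInstance().createStorage();
db.open("catalog.dbs", 64 * 1024 * 1024);   // file name and page pool size are arbitrary

Index<Item> root = db.createIndex(String.class, true);  // unique index keyed by a string
db.setRoot(root);

Item item = new Item(db);                   // hypothetical Persistent subclass (a small cluster)
root.put(new Key("some-key"), item);

db.commit();
db.close();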


Hi,

maybe I did not make myself clear. The problem appears when I am creating the database, i.e. when I am initially storing a large number of freshly created in-memory objects to the DB file, not when I am loading data from the DB file back into memory.

And yes, the data model I use matches your description, so I had no problem reading data.

Maybe a different approach could be used for traversing the objects: keep the objects waiting for serialization in a Java collection instead of on the JVM stack?

Regards,

Vladimir


If the problem happens while storing data in the database, not while loading it, then it can be solved either by using shorter transactions or by using the Persistent.store() method instead of Persistent.modify(). The Persistent.store() method is less efficient if an object is updated multiple times, because it causes an immediate write of the object to disk. But in this case the number of objects pinned in memory is reduced, so the commit does not cause a stack overflow.

Unfortunately, if you are using serializable transactions, the Persistent.store() method cannot be used (because modified objects cannot be written to disk before the transaction commits). In that case the only choice is to commit more frequently, although this also has a negative impact on performance.
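To make the "commit more frequently" option concrete, a rough sketch of a bulk load with periodic commits (the Record class, the input data, the index and the batch size are illustrative, assuming an already opened Storage db with a Perst index as its root):

// Commit every N inserted objects so modified objects do not all pile up
// for one huge commit at the end of the load.
int count = 0;
for (RecordData data : inputData) {        // hypothetical input
  Record rec = new Record(db, data);       // hypothetical Persistent subclass
  index.put(new Key(rec.getId()), rec);
  if (++count % 10000 == 0) {
    db.commit();                           // flush this batch to disk
  }
}
db.commit();                               // commit the remaining tail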


IMHO this is not a very good solution. One of the most attractive Perst features is that the application does not have to know in advance whether an object will persist until the transaction is committed. It also should not (and cannot) know how big the transaction will be. Moreover, splitting a transaction breaks basic transaction semantics: either all changes are applied or none at all.


Agreed. Can you send your database schema and the code that populates the database and causes the stack overflow, so we can investigate further? Email support-at-mcobject-dot-com if you need to upload to our FTP server.


I'm afraid I cannot extract the 'model' part and the loading code from the real application. But I can probably devise a simple test example.


Here is a simple example: a linked list.

The Link class:

package testperst;

import org.eclipse.jdt.annotation.NonNull;
import org.eclipse.jdt.annotation.Nullable;
import org.garret.perst.PinnedPersistent;
import org.garret.perst.Storage;

// A single node of a hand-made singly linked list: each node references
// the next one, so storing the head serializes the whole chain.
public class Link extends PinnedPersistent {
  public Link(@NonNull Storage storage, @Nullable Link next) {
    super(storage);
    this.next = next;
  }

  public Link getNext() {
    return next;
  }

  private final @Nullable Link next;
}
 

and the test method that results in the StackOverflowError:

  @Test public void testLongLinkedList() throws Exception {
    File file = new File(TEST_DBS);
    file.delete();                     // start with a fresh database file

    Storage db = getStorage();
    assert db != null;
    // Build a chain of 2000 links; the head transitively references the whole chain.
    Link link = new Link(db, null);
    for (int i = 0; i < 2000; i++) {
      link = new Link(db, link);
    }
    db.setRoot(link);                  // storing the root serializes the chain recursively
    db.close();
  }

The error appears when the number of links is more than 1200. I did not adjust the stack size.


This is what we tried to explain before: do not build your own collections (lists, ...) and you will not get a stack overflow.

Given Perst's architecture, there is no general solution to the problem of avoiding stack overflow when serializing arbitrary user-defined classes. In this particular case it would be possible to eliminate the tail recursion, because there is just one reference in the class, but that trick will not help in the case of a doubly-linked (L2) list...
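For comparison, a sketch of the same number of objects stored through a built-in Perst collection (Node here is a hypothetical Persistent subclass, and getStorage() is the same helper used in the test above):

  @Test public void testManyObjectsViaIndex() throws Exception {
    Storage db = getStorage();
    assert db != null;
    // A built-in Perst index is the root; every object is stored individually,
    // so serialization never recurses through a long chain of references.
    Index<Node> root = db.createIndex(long.class, true);
    db.setRoot(root);
    for (long i = 0; i < 2000; i++) {
      root.put(new Key(i), new Node(db, i));
    }
    db.commit();
    db.close();
  }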


I do not organize my own collections. The example I sent just demonstrates how to break the program with a relatively short linked list.

In my 'real' application I manage a few tree-like structures with mutual links between nodes in different trees. Any node may have an arbitrary (though usually small, 0 or 1) number of links to nodes in other trees. The links are implemented as Set objects.

In this model I can easily imagine Perst travelling a long distance through the links between the trees, and there is currently nothing I can do about it.

I think it should be possible to use a Java collection to compute the set of reachable objects instead of recursion, which relies on the JVM stack.
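To illustrate the idea, a sketch of an iterative, worklist-based traversal of an object graph; this is not Perst code, and the Node type with its getReferences() method (and the serialize() placeholder) are hypothetical:

import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.Set;

// Depth-first traversal driven by an explicit worklist on the heap, so the
// depth of the object graph cannot overflow the JVM stack.
Deque<Node> worklist = new ArrayDeque<>();
Set<Node> visited = Collections.newSetFromMap(new IdentityHashMap<>());
worklist.push(root);
while (!worklist.isEmpty()) {
  Node current = worklist.pop();
  if (!visited.add(current)) {
    continue;                        // already scheduled for serialization
  }
  serialize(current);                // placeholder for "store this object"
  for (Node next : current.getReferences()) {
    if (next != null) {
      worklist.push(next);
    }
  }
}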

