McObject Forums
leftwing47

Disappearing Objects?

leftwing47    0

Hi,

I'm currently building an application which, due to various requirements for concurrency, low hardware costs, sheer breadth of task and cumulative size of data, uses Perst as persistent "temporary" storage. It builds up an aggregated view of the data during a long-running, multi-threaded interrogation of a large backing store.

The objects themselves are nested under the root in a shallow hierarchy of treemaps (order being important), using this structure:

    public class RootClass extends Persistent {
        public FieldIndex<Host> hostKeyIndex;
    }

    public class Host extends Persistent {
        private String hostId;
        private TreeMap<String, Page> pages;
    }

    public class Page extends Persistent {
        private String pageId;
        private TreeMap<String, Keyword> keywords;
    }

    public class Keyword extends Persistent {
        private String keywordId;
        private TreeMap<Integer, Integer> instances;
    }

Now, I initiate Perst in this manner:

    perst = StorageFactory.getInstance().createStorage();
    perst.setProperty("perst.concurrent.iterator", true);
    perst.setProperty("perst.alternative.btree", true);
    perst.setProperty("perst.serialize.transient.objects", true);
    perst.open("somefile.odb", pagePoolSize);

And proceed to write data in a seemingly normal way - in my worker threads I'm using a locking object and writing objects to perst inside serializable transactions. Everything appears fine.

The problems arise when I reach the end of my long-running job and start to iterate through the objects stored in Perst. In shorter runs, which generate less data, everything appears to work fine. In longer runs, which generate significantly more data, some or all of the objects stored under the root seem to just, well, vanish! Something is amiss with my use of Perst, and it seemingly has to do with objects being persisted to RAM instead of disk. If I run a job with an infinite page pool size and a null file, the job completes perfectly.

Does anyone have any advice on using perst in this manner?
Thanks!

perstmco    0

Sorry, it is not completely clear what you mean by "all of the objects stored under the root seem to just, well, vanish" and "objects being persisted to RAM instead of disk".

Do you mean that objects are not written to the disk (stubs are fetched from the disk)?

Most likely there is something wrong with object reachability.

We suspect the bug is in the code storing data in the database.

Can you send a sample reproducing the problem? Or at least the fragment of code that saves data in Perst storage?

leftwing47    0

Hi,

Thanks for replying. I can't access the code from my current location, but will be able to next week. I'll try and describe the symptoms a little better though!

The process in all cases follows the same course:

1. Create the Perst root, then create and save new host objects, adding them to the host key index.

2. Create new page objects, then store them in the pages map under the relevant host object.

3. Create new keyword objects, then store them in the keywords map under the relevant page object.

So, effectively, I'm persisting a tree-like structure in Perst, accessible via the host key index.
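
For readers following along, the three steps above can be sketched in plain Java with no Perst dependency. This is only an illustration of the shape of the tree, not the poster's actual code: the class names mirror those in the first post, a `TreeMap` stands in for the host key index, and `HostTreeSketch`, `build`, `countKeywords`, the sample IDs, and the meaning of the `instances` entries are all assumptions made for the example.

```java
import java.util.TreeMap;

class HostTreeSketch {
    static class Keyword {
        final String keywordId;
        // Key/value meaning is not stated in the thread; assumed here for illustration.
        final TreeMap<Integer, Integer> instances = new TreeMap<>();
        Keyword(String keywordId) { this.keywordId = keywordId; }
    }

    static class Page {
        final String pageId;
        final TreeMap<String, Keyword> keywords = new TreeMap<>();
        Page(String pageId) { this.pageId = pageId; }
    }

    static class Host {
        final String hostId;
        final TreeMap<String, Page> pages = new TreeMap<>();
        Host(String hostId) { this.hostId = hostId; }
    }

    /** Builds a one-host, one-page, one-keyword tree following steps 1 to 3. */
    static TreeMap<String, Host> build() {
        // In-memory stand-in for the host key index under the root.
        TreeMap<String, Host> hostIndex = new TreeMap<>();

        // Step 1: create a host and add it to the index.
        Host host = new Host("example.org");
        hostIndex.put(host.hostId, host);

        // Step 2: create a page and store it under the relevant host.
        Page page = new Page("/index.htm");
        host.pages.put(page.pageId, page);

        // Step 3: create a keyword and store it under the relevant page.
        Keyword keyword = new Keyword("perst");
        keyword.instances.put(1, 1);
        page.keywords.put(keyword.keywordId, keyword);

        return hostIndex;
    }

    /** Walks the whole tree via the index and counts the keywords found. */
    static int countKeywords(TreeMap<String, Host> hostIndex) {
        int total = 0;
        for (Host h : hostIndex.values()) {
            for (Page p : h.pages.values()) {
                total += p.keywords.size();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(countKeywords(build()));
    }
}
```

The final walk in `countKeywords` corresponds to the end-of-job iteration described in the thread: hosts are reached through the index, and everything else is reached only through the nested maps.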

Now, when I initialize perst with an infinite page pool the entire structure can be iterated upon via the host key index and nested objects are retrieved intact. So far, so good.

But, when I limit the page pool size, I can iterate through the correct number of host objects via the host key index, but the nested objects no longer exist. This only happens on larger datasets, making me think that processing datasets which exceed the page pool size is causing corruption or odd behaviour.

Does that help in lieu of code, which I will endeavour to supply next week?

Many thanks in advance!

perstmco    0

The problem is probably caused by missing calls to Persistent.modify(). Once an object has been written to the storage, any further modifications to it are lost unless the object is marked as modified. This is why we'd like to see the code constructing this tree.
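
To make the point above concrete, here is a toy model of dirty-flag tracking in plain Java. It is a sketch under stated assumptions and not Perst's implementation: `ToyStore`, `Node`, `putSilently`, `putAndModify`, and `evictAll` are hypothetical names, the "disk" is just a map of snapshots, and eviction is triggered by hand rather than by memory pressure. It may also hint at why smaller runs looked fine, on the assumption that live objects simply never had to be reloaded.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class DirtyFlagSketch {
    /** Toy persistent object with a sorted map of entries and a dirty flag. */
    static class Node {
        final int oid;
        final TreeMap<String, String> entries = new TreeMap<>();
        boolean dirty;
        Node(int oid) { this.oid = oid; }

        // Changes the object but forgets to mark it as modified.
        void putSilently(String key, String value) { entries.put(key, value); }

        // Changes the object and marks it as modified.
        void putAndModify(String key, String value) { entries.put(key, value); dirty = true; }
    }

    /** Toy store: "disk" holds committed snapshots, "cache" holds live objects. */
    static class ToyStore {
        private final Map<Integer, TreeMap<String, String>> disk = new HashMap<>();
        private final Map<Integer, Node> cache = new HashMap<>();
        private int nextOid = 1;

        Node create() {
            Node n = new Node(nextOid++);
            n.dirty = true;              // new objects must be written at least once
            cache.put(n.oid, n);
            return n;
        }

        /** Writes only the objects flagged as dirty, then clears their flags. */
        void commit() {
            for (Node n : cache.values()) {
                if (n.dirty) {
                    disk.put(n.oid, new TreeMap<>(n.entries));
                    n.dirty = false;
                }
            }
        }

        /** Drops all live objects, as a bounded cache eventually might. */
        void evictAll() { cache.clear(); }

        /** Returns the live object if cached, otherwise rebuilds it from its snapshot. */
        Node load(int oid) {
            Node n = cache.get(oid);
            if (n == null) {
                n = new Node(oid);
                TreeMap<String, String> snapshot = disk.get(oid);
                if (snapshot != null) {
                    n.entries.putAll(snapshot);
                }
                cache.put(oid, n);
            }
            return n;
        }
    }

    /** Stores an empty node, adds one entry, commits, optionally evicts, then reloads. */
    static int entriesAfterReload(boolean callModify, boolean evict) {
        ToyStore store = new ToyStore();
        Node host = store.create();
        store.commit();                  // host is written while still empty
        if (callModify) {
            host.putAndModify("/index.htm", "page");
        } else {
            host.putSilently("/index.htm", "page");
        }
        store.commit();
        if (evict) {
            store.evictAll();
        }
        return store.load(host.oid).entries.size();
    }

    public static void main(String[] args) {
        System.out.println(entriesAfterReload(false, false)); // 1: live object still cached
        System.out.println(entriesAfterReload(false, true));  // 0: unflagged change was never written
        System.out.println(entriesAfterReload(true, true));   // 1: flagged change survived eviction
    }
}
```

In this model the unflagged change is visible only for as long as the live object stays in the cache, which is one way a bug of this kind could stay hidden until the dataset grows.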

leftwing47    0

OK, so Perst is configured thus:

    perst = StorageFactory.getInstance().createStorage();
    perst.setProperty("perst.concurrent.iterator", true);
    perst.setProperty("perst.alternative.btree", true);
    perst.setProperty("perst.serialize.transient.objects", true);
    perst.open(appConfiguration.getString("perst.root") + pid + ".odb", pagePoolSize);
    perstRoot = (RootClass) perst.getRoot();
    if (perstRoot == null) {
        perstRoot = new RootClass(perst);
        perst.setRoot(perstRoot);
    }

Initialising a "Host" object looks like this:

    public class Host extends Persistent {
        private String id;
        private TreeMap<String, Page> pages;

        Host(String id) {
            this.id = id;
            this.pages = new TreeMap<>();
        }
    }

During my aggregation process, worker threads are spun up and effectively create the tree, using code like this:

    Page currentPage = new Page();
    try {
        db.beginThreadTransaction(db.SERIALIZABLE_TRANSACTION);
        hostCurrent.addPage("/index.htm", currentPage);
        db.endThreadTransaction();
    } catch (Exception e) {
        logger.error(e.getMessage(), e);
        if (db.isInsideThreadTransaction()) {
            db.rollbackThreadTransaction();
        }
    }

The addPage method looks like this:

    public void addPage(String path, Page page) {
        this.pages.put(path, page);
    }

Now, I've stripped out the various bits of locking code required for safe concurrency, and I'm not seeing deadlocks or weird behaviour on that front. You mention calls to Persistent.modify, but does that apply to serializable transactions inside worker threads?

Does that help? Again, thanks in advance for your assistance!
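
The begin/end/rollback pattern in the worker-thread snippet above can be factored into a small helper so that every worker handles failure the same way. The sketch below is plain Java and does not use Perst: `ThreadTx` is a hypothetical interface whose method names merely mirror the calls shown in the post, `RecordingTx` is a fake used to show the call order, and the mode value `0` is a placeholder rather than a real transaction constant.

```java
import java.util.ArrayList;
import java.util.List;

class TxHelperSketch {
    /** Hypothetical stand-in for the transaction calls used in the post. */
    interface ThreadTx {
        void beginThreadTransaction(int mode);
        void endThreadTransaction();
        void rollbackThreadTransaction();
        boolean isInsideThreadTransaction();
    }

    /** Runs the work inside a transaction; rolls back and returns false on failure. */
    static boolean runInTransaction(ThreadTx db, int mode, Runnable work) {
        try {
            db.beginThreadTransaction(mode);
            work.run();
            db.endThreadTransaction();
            return true;
        } catch (RuntimeException e) {
            // Only roll back if a transaction is actually open on this thread.
            if (db.isInsideThreadTransaction()) {
                db.rollbackThreadTransaction();
            }
            return false;
        }
    }

    /** Fake transaction object that records the order of calls. */
    static class RecordingTx implements ThreadTx {
        final List<String> calls = new ArrayList<>();
        private boolean inside;

        public void beginThreadTransaction(int mode) { inside = true; calls.add("begin"); }
        public void endThreadTransaction() { inside = false; calls.add("end"); }
        public void rollbackThreadTransaction() { inside = false; calls.add("rollback"); }
        public boolean isInsideThreadTransaction() { return inside; }
    }

    /** Returns the recorded call order for a succeeding or failing unit of work. */
    static List<String> demo(boolean fail) {
        RecordingTx tx = new RecordingTx();
        runInTransaction(tx, 0, () -> {
            if (fail) {
                throw new IllegalStateException("simulated failure");
            }
        });
        return tx.calls;
    }

    public static void main(String[] args) {
        System.out.println(demo(false));
        System.out.println(demo(true));
    }
}
```

Returning a boolean rather than only logging lets the caller decide whether to retry or abandon the unit of work; whether that is appropriate depends on the locking code omitted from the post.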

leftwing47    0

OK, so following on from your suggestions, I ran up some test classes to play around a bit more.

addPage now looks like this:

    public void addPage(String path, Page page) {
        this.pages.put(path, page);
        modify();
    }

and the tree creation code now looks like this:

    perst.beginThreadTransaction(perst.SERIALIZABLE_TRANSACTION);
    perstRoot.hostKeyIndex.remove(hostCurrent);
    perst.modify(hostCurrent);
    perstRoot.hostKeyIndex.put(hostCurrent);
    perst.endThreadTransaction();

Now a simple test ("create some persistent hosts with x pages", then "add x pages to the persistent hosts") seems to persist fine, and the contents look OK.

Are these changes along the lines that you were suggesting?

Thanks!
