About wilson0x4d

  1. Thanks, I overlooked that since we only assign a value to PArray once, but I can see how that doesn't mean the value is never paged out and back in again (thus necessitating a lock). We've integrated this change in our code.

     We're still seeing this "Incompatible key type" error in our live environment. It happens once for every few thousand Get() calls. I can also get the test code to exhibit this error inconsistently. For example, on the current run I successfully added ~138,000 entries before it occurred. I had to change the number of items being inserted in Program.cs (lines 16-17), then stop and restart the test a few times. Other times the app was able to insert over a million items without a problem.

     The net effect for us is that when this error occurs we believe we are orphaning an object in the store and/or failing to add an object to the store, depending on the exact call path from our code (it occurs in both TestContainer::Remove and TestContainer::Add).

     From BTree.cs, lines 442-445:

         if (key.type != type)
         {
             throw new StorageError(StorageError.ErrorCode.INCOMPATIBLE_KEY_TYPE);
         }

     When this happens under the debugger I inspect "key.type" and "type" and they are both "tpGuid", so I'm confused why this exception is being thrown. If I then drag the instruction pointer back up to the conditional and re-execute it, the check passes the second time. I looked at the code for getKeyFromObject and see that 'key' is (or should be) a new object instance, which (IMO) rules out a race condition on setting the 'type' property, and it also looks like BTree::type is set only once, during construction. I'm totally confused about this error.

     Additionally, the null reference exception thrown by BTreeFieldIndex exhibits similar behavior. From BTreeFieldIndex.cs, line 145:

         private Key extractKey(object obj)
         {
             Object val = mbr is FieldInfo ? ((FieldInfo)mbr).GetValue(obj) : ((PropertyInfo)mbr).GetValue(obj, null);
             if (val == null)

     When I evaluate "mbr is FieldInfo ? ((FieldInfo)mbr).GetValue(obj) : ((PropertyInfo)mbr).GetValue(obj, null)" in the watch window it properly extracts the key value. If I continue the debug session everything seems to go on as if nothing happened.

     I can work around these issues by updating our calling code to retry until it works, but this seems like a horribly inefficient hack. We would much rather understand the root cause and fix it, or stop doing whatever it is we're doing that is causing it (assuming it's our code or our usage of Perst).

     Since this is reproducible on 3 different machines (laptop, desktop, server hardware) and from two different EXEs (the service EXE and this console EXE), I have to rule out the hardware and software environment as the cause. We've confirmed the issue on Windows 2012, Windows 8 and Windows 8.1, though I doubt these things matter. Additionally, all our test hardware has a minimum of 4 cores, but again I don't know that this matters.

     Thanks again for any insight/assistance.
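For reference, the retry workaround mentioned in the post above could look roughly like the following. This is only a minimal sketch using the .NET base class library: the `Retry` class and its `Run` method are hypothetical names invented for illustration and are not part of Perst or of the test code. It bounds the number of attempts so a persistent failure still surfaces instead of looping forever.

```csharp
using System;
using System.Threading;

// Hypothetical helper (not part of Perst): retries an action a bounded number of times.
public static class Retry
{
    // Runs 'action' up to maxAttempts times, retrying only when it throws TException.
    // Returns the number of attempts that were needed.
    public static int Run<TException>(Action action, int maxAttempts, TimeSpan delay)
        where TException : Exception
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                action();
                return attempt;
            }
            catch (TException) when (attempt < maxAttempts)
            {
                // Treated as transient: pause briefly, then try again.
                // On the last attempt the filter is false and the exception propagates.
                Thread.Sleep(delay);
            }
        }
    }
}
```

In the scenario described, the call site would presumably be something like `Retry.Run<StorageError>(() => container.Add(key, value), 3, TimeSpan.FromMilliseconds(1))`, where `container` is an assumed TestContainer instance. This only masks the symptom and does nothing to explain the root cause.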
  2. Well, I may have spoken too soon. It seems that I receive the "incompatible key" error in all cases; in some configurations it's just chance whether it occurs, while in others I can coerce it every time I run the test tool.

     I have uploaded a new version of the test code that has three #defines at the top of the TestContainer.cs file, which makes it much easier to test the following:

       1. .NET ReaderWriterLock for synchronization (#define USE_RWLOCK)
       2. Storage.BeginThreadTransaction for synchronization (#define USE_PTRAN)
       3. The perst.alternative.btree property setting (#define USE_ALTERNATIVE_BTREE)

     This code removes the "r/w lock per index" approach, which consistently failed with a null-reference exception. However, if someone tells me that it should work, I would prefer such a locking model since it would dramatically reduce contention in our application.

     New source code is: http://1drv.ms/1o98ByP

     Again, thanks for any insight/help on this.
  3. I think I need help understanding something about how Perst functions internally.

     I use reader/writer locks to protect access to individual FieldIndex<> instances, in the belief that this would reduce contention between individual indexes. Thus, there is one r/w lock per index. However, I've now introduced a BeginThreadTransaction(ReadWrite) and BeginThreadTransaction(ReadOnly) wherever I have a rwlock.EnterWriteLock and rwlock.EnterReadLock (respectively). This seems to have eliminated the error I was experiencing before, which raises a question: despite being separate FieldIndex<> instances, are they still sharing some internal state and overwriting one another, such that I cannot lock hierarchically per FieldIndex but must resort to a global/shared lock which encompasses all indexes?

     If this is the case, is there anything I can do to isolate these indexes short of instancing multiple separate storage objects? I tried setting the perst.alternative.btree option but this didn't seem to resolve my issue. I also tried the multiclient option, but this incurred the requirement that I use serializable transactions, which resulted in much slower performance (by a factor of almost 200).

     One of the reasons this structure uses multiple FieldIndex<> instances is that we noticed index performance degradation somewhere between 300k and 500k inserts, and we have a requirement to store roughly half a billion items (~100 million a day). We tested the persistentHash implementation, and despite being faster it seemed to leave garbage in the store. (We have a unit test which fills the data structure, removes the items, then performs a backup to compact the file, and it does this 100 times. Unlike other data structures, the persistent hash would grow with every iteration.)

     At any rate, thanks for any additional info/confirmations/etc.
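The coarse-grained locking model described in the post above, where one lock encompasses every index, can be sketched with plain .NET types. This is an illustration under stated assumptions, not the test code itself: `ShardedContainer` is a hypothetical name, and `Dictionary<Guid, TValue>` stands in for FieldIndex<> so the sketch runs without Perst. It says nothing about whether Perst indexes actually share internal state.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical stand-in for a container that spreads entries over several indexes.
// Dictionary<Guid, TValue> replaces FieldIndex<> so the sketch needs no Perst reference.
public sealed class ShardedContainer<TValue>
{
    private readonly Dictionary<Guid, TValue>[] shards;

    // One lock shared by all shards, mirroring a storage-wide transaction scope
    // rather than one reader/writer lock per index.
    private readonly ReaderWriterLockSlim gate = new ReaderWriterLockSlim();

    public ShardedContainer(int shardCount)
    {
        shards = new Dictionary<Guid, TValue>[shardCount];
        for (int i = 0; i < shardCount; i++) shards[i] = new Dictionary<Guid, TValue>();
    }

    private Dictionary<Guid, TValue> ShardFor(Guid key)
    {
        // Mask the sign bit so the modulo result is never negative.
        return shards[(key.GetHashCode() & int.MaxValue) % shards.Length];
    }

    public void Add(Guid key, TValue value)
    {
        gate.EnterWriteLock();
        try { ShardFor(key)[key] = value; }
        finally { gate.ExitWriteLock(); }
    }

    public bool TryGet(Guid key, out TValue value)
    {
        gate.EnterReadLock();
        try { return ShardFor(key).TryGetValue(key, out value); }
        finally { gate.ExitReadLock(); }
    }

    public bool Remove(Guid key)
    {
        gate.EnterWriteLock();
        try { return ShardFor(key).Remove(key); }
        finally { gate.ExitWriteLock(); }
    }
}
```

The trade-off is the one raised in the post: a single gate serializes all writers regardless of which shard they touch, so it is safe even if the shards share hidden state, at the cost of the contention that per-index locks were meant to avoid.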
  4. I've uploaded the test code to repro this: http://1drv.ms/1jDUzDE

     It is built against the perst-444 generics DLL and uses C:\TestData\ as a storage location.
  5. I wanted to follow up on this and note that we haven't experienced the problem again. Additionally, we discovered a 'leak' or 'orphaning' of objects in the store, which has since been corrected.

     Also, as an aside, we do not catch and ignore exceptions (as a matter of practice we apply catch{} blocks minimally, preferring a service call to crash and burn). Those we do catch are all logged to a log server. I've reviewed our logs and there are no StorageError or IO exceptions logged.

     It's looking like it was an isolated case. I'm not sure what the root cause was, but if it happens again we'll collect all the info we can and update here. Thanks!
  6. We have a data file that was returning a strange error after being re-opened (failure to cast from ObjectA to ObjectB after being fetched from a PArray, where ObjectA wasn't the type being stored to the PArray to begin with but was instead a child of ObjectB). Before posting to the forums I wanted to have a console application that reproduced the error. While I cannot repro that error, I am experiencing a different error every time I execute the test app. I am hoping that this error is somehow related to the other one, but even if it is not, it's something I would like to understand and get resolved before moving forward.

     Exception details (verbatim):

         Perst.StorageError occurred
           _HResult=-2146233088
           _message=Incompatible key type
           HResult=-2146233088
           IsTransient=false
           Message=Incompatible key type
           Source=PerstNetGenerics
           StackTrace:
                at Perst.Impl.Btree`2.checkKey(Key key) in d:\Perst4.NET\src\impl\Btree.cs:line 444
           InnerException:

     Incidentally, there is a second exception we have experienced while running this tool:

         System.NullReferenceException was unhandled
           _HResult=-2147467261
           _message=Object reference not set to an instance of an object.
           HResult=-2147467261
           IsTransient=false
           Message=Object reference not set to an instance of an object.
           Source=PerstNetGenerics
           StackTrace:
                at Perst.Impl.BtreeFieldIndex`2.extractKey(Object obj) in d:\Perst4.NET\src\impl\BtreeFieldIndex.cs:line 145
                at Perst.Impl.BtreeFieldIndex`2.Put(V obj) in d:\Perst4.NET\src\impl\BtreeFieldIndex.cs:line 230
                at ConsoleApplication1.TestContainer`1.Add(Guid key, TValue value) in z:\Code\wilson0x4d\Perst\ConsoleApplication1\ConsoleApplication1\TestContainer.cs:line 244
                at ConsoleApplication1.Program.PopulatorThreadMain(Object state) in z:\Code\wilson0x4d\Perst\ConsoleApplication1\ConsoleApplication1\Program.cs:line 117
                at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
                at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
                at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
                at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
                at System.Threading.ThreadHelper.ThreadStart(Object obj)
           InnerException:

     I'm not sure how to post the source code for testing this issue. It's consistent, though.
  7. We don't have a repro case, but of the 15 machines this code was deployed to, only one experienced this problem (and the only difference between it and the others was that it ran out of disk space after losing network connectivity with an upstream host). It was a hasty assumption on my part that the data file had been corrupted. I can confirm you are correct that the data file is not corrupted: we attached it to another instance in our test environment, and after about 1 minute (presumably time spent recovering an incorrectly closed file) there were no complaints from the service. We then spun up the service that had this problem and it has not complained once since. (We had shut it down until we could investigate further.)

     I'll also check the build config and kick off a new build before re-testing. It's possible whoever configured the build did something the wrong way and Asserts are not emitted in the build. After reviewing the code I have to agree that the assert should have failed.

     Also after reviewing the code (correct me if I'm wrong), my understanding is that because we specified a page pool of fixed size, once the disk had 0 bytes free and pages could no longer be persisted to disk, the page pool was exhausted, resulting in this error?

     Thanks for the info on #2. We now have two reasons to write an internal tool (understanding that it would be specific to our schema/model): firstly for performing 'compaction', and secondly for performing 'last chance recovery' of a corrupt data file. I realize the complexity of performing a consistency check, but I still had to ask. It's certainly not impossible, but it would likely turn into a monstrosity of code which McObject would have no real value in producing/maintaining.

     I'll let you know if I can repro the issue for your review, but I think I understand the root cause at this point. Thanks!
  8. We've experienced an interesting cast exception:

         Type = System.InvalidCastException,
         Message = Unable to cast object of type 'Perst.Impl.LRU' to type 'Perst.Impl.Page'.,
         StackTrace =
              at Perst.Impl.PagePool.find(Int64 addr, Int32 state) in h:\Jenkins\workspace\Perst4.NET\src\impl\PagePool.cs:line 90
              at Perst.Impl.StorageImpl.getPos(Int32 oid) in h:\Jenkins\workspace\Perst4.NET\src\impl\StorageImpl.cs:line 178
              at Perst.Impl.StorageImpl.storeObject0(Object obj, Boolean finalized) in h:\Jenkins\workspace\Perst4.NET\src\impl\StorageImpl.cs:line 5056
              at Perst.Impl.StorageImpl.storeObject(Object obj) in h:\Jenkins\workspace\Perst4.NET\src\impl\StorageImpl.cs:line 4961
              at Perst.Persistent.Store() in h:\Jenkins\workspace\Perst4.NET\src\Persistent.cs:line 82
              at Perst.Impl.StorageImpl.Store(Object obj) in h:\Jenkins\workspace\Perst4.NET\src\impl\StorageImpl.cs:line 7622
              at Perst.Impl.LruObjectCache.flush() in h:\Jenkins\workspace\Perst4.NET\src\impl\LruObjectCache.cs:line 249
              at Perst.Impl.StorageImpl.Commit() in h:\Jenkins\workspace\Perst4.NET\src\impl\StorageImpl.cs:line 1531

     It would seem that the map is desynchronized/corrupt, and now Perst is unable to pull its own internal types out of the database? We believe this was caused by a disk space issue, so we have a few questions:

       1) Is there any way for Perst to handle a low-disk-space scenario, such as refusing to commit (which would be preferable to corrupting the data), or some other built-in solution?
       2) Is there any way to perform a consistency check and fix this corruption? For example, a tool that can see that the map points to a type of T, but when inspected/hydrated the type is NOT of type T.
       3) Is there anything else we can do to prevent this kind of corruption from occurring?

     Thanks for any insight.
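Independently of whatever Perst itself offers for question 1, an application can check free disk space before asking the storage to commit. The following is a minimal sketch using only `System.IO.DriveInfo` from the .NET base class library. `DiskSpaceGuard`, `HasFreeSpace`, and `CommitIfSpace` are hypothetical names, and the sketch assumes the data file lives on a local volume identified by the path's root (a UNC path would make `DriveInfo` throw).

```csharp
using System;
using System.IO;

// Hypothetical application-side guard; not a Perst feature.
public static class DiskSpaceGuard
{
    // Returns true when the volume at the root of 'path' reports
    // at least 'minFreeBytes' available to the current user.
    public static bool HasFreeSpace(string path, long minFreeBytes)
    {
        string root = Path.GetPathRoot(Path.GetFullPath(path));
        var drive = new DriveInfo(root);
        return drive.AvailableFreeSpace >= minFreeBytes;
    }

    // Invokes 'commit' only when enough space is reported.
    // Returns false (without committing) otherwise.
    public static bool CommitIfSpace(string path, long minFreeBytes, Action commit)
    {
        if (!HasFreeSpace(path, minFreeBytes)) return false;
        commit();
        return true;
    }
}
```

The `commit` delegate would presumably wrap the storage's commit call, with a threshold chosen comfortably above the page pool size. The check narrows the window but cannot close it, since free space can still disappear between the check and the write.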