New image buffer #570
base: main
…oreAndDevices into data_buffer_new
…nters to java longs
…ure metadata is accurate for v2 buffer images
…c; Also centralized Metadata generation into the core from SWIG, core callback, device base
I moved that method to the core and out of the SWIG layer, but will keep it indefinitely for backwards compatibility
Added system state cache to summary metadata: micro-manager/AcqEngJ#127
Okay, I think I've addressed everything except for the two remaining unresolved comments above. I think what to do about acquiring write slots can be addressed in a future PR, but it would be good to figure out what the eventual strategy will be.
This was indeed an accidental omission. I've added them to the interface, but this is commented out for now
I considered creating an intermediate compatibility buffer that would temporarily store data when a camera acquires a write slot with
I think the path forward is to enable write slot acquisition in a future PR after additional testing, with the requirement that camera device adapters using the acquire/release slot feature must use the new buffer. Since the performance testing shows
I'm finally getting around to looking at the details of the image retrieval API for the V2 buffer. Please correct me if I'm misunderstanding anything below.
Using the new mechanism, the app (let's look at Java for now) calls
On the other hand, if the app obtains a
It's also generally hard to see what all the problems are that could arise from the lifetime management (or lack thereof) of buffer slots -- a problem in itself.
I think the only safe way to deal with this is to explicitly share ownership of the buffer slots (including the memory backing them) between MMCore and the app. This could be done by managing the slots with
After having written the above, I realized that you don't have separate buffers for each slot (like the V1 buffer) but rather one big, contiguous buffer. You can still have
(It is not clear to me what the advantage of the contiguous buffer is. You end up using
As for the Java API for things that need to be explicitly "released" by user code, it should conform to the
(Ideally we also automatically release the
I'm afraid I cannot recommend merging this until these buffer lifetime issues are addressed. If possible, it might be productive to split this PR into two: one that cleans up the metadata handling without introducing the V2 buffer, and one that purely introduces the V2 buffer. That would speed up reviewing the changes to the metadata handling (which I still need to take another look at -- I'm mostly happy with it but it's easier to make 100% sure there are no unknown changes in behavior than to later troubleshoot the existing (sometimes hacky) application code that might depend on exact behavior).
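To make the shared-ownership idea above concrete, here is a minimal sketch in C++. It assumes `std::shared_ptr` as the ownership mechanism, which is only one possible choice, and every name in it (`Backing`, `SlotHandle`, `Buffer`, `Reallocate`) is hypothetical rather than taken from MMCore. It shows a slot handle co-owning the backing memory, so that replacing the buffer cannot free memory a handle still points into.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical types for illustration only; none of these exist in MMCore.
using Backing = std::vector<unsigned char>;

// A slot handle that co-owns the backing memory it points into.
struct SlotHandle {
    std::shared_ptr<Backing> backing;  // keeps the memory alive
    std::size_t offset;
    std::size_t size;

    const unsigned char* data() const { return backing->data() + offset; }
};

class Buffer {
public:
    explicit Buffer(std::size_t bytes)
        : backing_(std::make_shared<Backing>(bytes)) {}

    // Hands out a handle that shares ownership of the current backing memory.
    SlotHandle Slot(std::size_t offset, std::size_t size) const {
        assert(offset + size <= backing_->size());
        return SlotHandle{backing_, offset, size};
    }

    unsigned char* raw() { return backing_->data(); }

    // Swaps in fresh memory. The old memory is freed only when the last
    // handle referring to it is destroyed.
    void Reallocate(std::size_t bytes) {
        backing_ = std::make_shared<Backing>(bytes);
    }

    long BackingUseCount() const { return backing_.use_count(); }

private:
    std::shared_ptr<Backing> backing_;
};
```

With this arrangement a handle obtained before `Reallocate` stays readable afterwards; the cost is that old memory lingers for as long as any handle is alive.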
Thanks for taking a look. Your understanding is mostly correct -- but in a couple places I think you've misunderstood, and in fact the behavior you're advocating for is already implemented.
Correct
The clearing/deletion of the v2 buffer is handled differently than the v1 circular buffer for exactly this reason. The v1 buffer is cleared/reallocated every time before starting a sequence acquisition. The v2 buffer is not. For v2, we've now split this into two separate operations: clearing and resetting. Trying to clear while application code holds outstanding slots will throw an error. Reset is the more dangerous operation that has the problems you mention, which is why it's not simply slotted in everywhere that the old circular buffer used to be cleared. For example, for the case of changing the buffer size, we have:
void BufferManager::ReallocateBuffer(unsigned int memorySizeMB) {
    if (useNewDataBuffer_.load()) {
        // Refuse to reallocate while application code still holds slots.
        int numOutstanding = newDataBuffer_->NumOutstandingSlots();
        if (numOutstanding > 0) {
            throw CMMError("Cannot reallocate NewDataBuffer: " + std::to_string(numOutstanding) + " outstanding active slot(s) detected.");
        }
        delete newDataBuffer_;
        newDataBuffer_ = new DataBuffer(memorySizeMB);
    } else {
        // The legacy circular buffer is always reallocated unconditionally.
        delete circBuffer_;
        circBuffer_ = new CircularBuffer(memorySizeMB);
    }
}
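The difference between clearing and resetting described above can be sketched in isolation. This is a simplified, single-threaded model under stated assumptions: `SlotTracker` and its methods are hypothetical names, not the `DataBuffer` API, and the generation counter is only an illustrative stand-in for "the buffer contents were discarded".

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the buffer's slot bookkeeping; not an MMCore class.
class SlotTracker {
public:
    // Called when application code is handed a pointer into the buffer.
    void Acquire() { ++outstanding_; }

    // Called when application code is done with that pointer.
    void Release() {
        assert(outstanding_ > 0 && "Release without matching Acquire");
        if (outstanding_ > 0) --outstanding_;
    }

    std::size_t NumOutstanding() const { return outstanding_; }
    std::size_t Generation() const { return generation_; }

    // Safe operation: refuses to run while any slot is still held.
    void Clear() {
        if (outstanding_ > 0) {
            throw std::runtime_error("Cannot clear: " +
                                     std::to_string(outstanding_) +
                                     " outstanding slot(s)");
        }
        ++generation_;
    }

    // Dangerous operation: discards contents even if slots are still held,
    // leaving any pointers held by application code dangling.
    void Reset() {
        outstanding_ = 0;
        ++generation_;
    }

private:
    std::size_t outstanding_ = 0;
    std::size_t generation_ = 0;  // bumped each time contents are discarded
};
```

A real implementation would also need synchronization, since slots are acquired and released from different threads.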
You'd have to call
This is essentially what already happens (though not with
Empirical testing indicated that the current mechanism of memory mapping a large buffer had the best performance. True about the fragmentation, though I don't think this is so likely to happen in practice (I can provide more detail if needed). In any case, this is an internal implementation detail that can always be changed in a future PR without breaking backwards compatibility.
In my opinion, 32-bit support should be retired.
I went back and forth on how forgiving the design should be about forgetting to call
I would say it's better to not upgrade to Java 9 first in case that brings other unforeseen issues.
I understand the motivation for this, but unfortunately, I think this would be very challenging. The metadata generation was tangled up in many other functions. I tried to do this very carefully to avoid unexpected changes in higher level application code. It will at least be straightforward to implement fixes once identified since it is all centralized now. We could consider including legacy metadata by default on the v1 buffer (even though it would give a performance hit) so that unexpected things don't break.
Update: I split out changes to the circular buffer behavior into #588
@marktsuchida I've made changes based on our discussion yesterday:
Also some further explanation on this:
Note that the contiguous buffer is memory mapped, so it's not the same as a regular contiguous allocation (which I tried first and was very slow). The combination of a contiguous memory mapping + slot management system (free regions, etc) is efficient because you never have to create new heap objects. The circular buffer is much slower to initialize (see graph above), especially for small image sizes, because it pre-allocates many frameBuffers to hold its images. I'm not sure how the new buffer could maintain flexibility to different image sizes yet not suffer this pre-allocation penalty without the current strategy of allocating a big block
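As a rough illustration of slot management over one contiguous block, the sketch below tracks free regions by offset and merges neighbours when a slot is freed. It is an assumption-laden simplification: `FreeRegionMap`, `kNoSpace`, and the first-fit policy are hypothetical and not taken from the PR, the memory-mapped block itself is not modeled, and the `std::map` used for bookkeeping does allocate small nodes, which a tuned implementation might avoid.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>

// Sentinel returned when no free region is large enough (hypothetical name).
constexpr std::size_t kNoSpace = static_cast<std::size_t>(-1);

// Hypothetical free-region bookkeeping over one contiguous block.
// Offsets are relative to the start of the block.
class FreeRegionMap {
public:
    explicit FreeRegionMap(std::size_t capacity) {
        if (capacity > 0) free_[0] = capacity;
    }

    // First-fit allocation: returns the slot offset, or kNoSpace.
    std::size_t Allocate(std::size_t size) {
        if (size == 0) return kNoSpace;
        for (auto it = free_.begin(); it != free_.end(); ++it) {
            if (it->second >= size) {
                const std::size_t offset = it->first;
                const std::size_t remaining = it->second - size;
                free_.erase(it);
                if (remaining > 0) free_[offset + size] = remaining;
                return offset;
            }
        }
        return kNoSpace;
    }

    // Returns a slot to the map, merging it with adjacent free regions.
    void Free(std::size_t offset, std::size_t size) {
        assert(size > 0);
        auto next = free_.lower_bound(offset);
        // Merge with the preceding region if it ends exactly at `offset`.
        if (next != free_.begin()) {
            auto prev = std::prev(next);
            if (prev->first + prev->second == offset) {
                offset = prev->first;
                size += prev->second;
                free_.erase(prev);
            }
        }
        // Merge with the following region if it starts where this one ends.
        if (next != free_.end() && next->first == offset + size) {
            size += next->second;
            free_.erase(next);
        }
        free_[offset] = size;
    }

    std::size_t NumFreeRegions() const { return free_.size(); }

private:
    std::map<std::size_t, std::size_t> free_;  // offset -> length
};
```

Because slots are just (offset, length) pairs inside one block, images of different sizes can share the buffer without pre-allocating a fixed pool of per-image buffers.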
@henrypinkard This is an amazing alternative! Is there any compiled Micro-Manager version that implements this new image buffer, or does it have to be compiled from source? Thanks!
No, you have to compile the core, the core wrap (either mmcorej or pymmcore) and the new device adapters from source. Note that there's not yet support for the new features in AcqEngJ. So you may want to compile MMCoreJ and its Jar wrapper, then modify AcqEngJ to have it make use of the generic data handling capabilities. You can test AcqEngJ through the
I managed to compile the core wrap and device adapter. However, even without enabling the new image buffer, Micro-Manager doesn't work in Live or MDA; it only works in Snap mode. I assume that if the new buffer is not enabled, it should function like the normal Micro-Manager, right?
Yes. Maybe try downloading the nightly build from when this PR was opened and using that as a starting point?
I tried downloading a nightly build from before Feb 17. Unfortunately, it didn't work. Snap always works fine, but when Live is on it gets stuck. Also, the Sequence Buffer Monitor seems to pile up and then get stuck. I am not sure what happened. The core log gave no output either.
Here's my full install with the Core and demo camera built from source, which works with the Demo camera: https://drive.google.com/file/d/11wLbqtzeYJAIbQjsw2-Q9slIM5fXGi7w/view?usp=sharing
Thank you so much, Henry! I will work on it and see how it goes.
The V2 buffer provides thread-safe, generic data storage with improved performance and cleaner abstractions.
Before merging
Design
Two core components:
DataBuffer: Thread-safe generic storage replacing CircularBuffer
BufferManager: Unified interface managing both legacy and new implementations
Key features of the new buffer system:
It can be enabled with:
Performance
As a drop-in replacement for the circular buffer (i.e. copying the data the same number of times, but allowing for arbitrary size and data types), the new buffer gives equal or better performance:
In sequence acquisitions:

In continuous sequence acquisitions (live mode):
It's significantly faster to allocate
Additionally, it has two key features that will enable much higher performance code:
Testing
I've written and validated the new buffer and circular buffer against many new tests here. (FYI these live in mmpycorex so they can easily test both MMCoreJ and pymmcore)
It also passes all the pycromanager acquisition tests, which test the various functionalities of the acquisition engine
Metadata
In conjunction with these changes, it made sense to standardize the metadata added to images. This was previously split amongst several places, making it hard to keep track of and maintain, including the SWIG wrapper, the core, the core callback, and the device code. Some of it was generated at the time of image acquisition, and some of it was generated at the time of image retrieval.
It has now all been consolidated into void CMMCore::addCameraMetadata, and the same metadata is added to all images whether snapped or passed through a buffer (with the small exception of some multi-camera device adapter-specific tags).
Testing reveals there's a substantial performance cost to adding so much metadata to all images:
Previously, much of this cost was incurred when reading images back out of the buffer. With the new changes, it is incurred at the time of insertion. However, I think it makes much more sense that this metadata is added at insertion time, because that's when it's most likely to be in sync with the actual state of hardware.
Since this consolidation takes place outside the BufferManager, it also affects the circular buffer and will change behavior even if the v2 buffer is disabled. We need to figure out what should be enabled here. It's unclear (to me) what higher level code depends on what tags, but including the union of all of them by default will substantially hurt performance. I also have just a temporary function in the core API for controlling which metadata to add, which should perhaps be replaced with something more permanent.
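One possible shape for a more permanent control over which metadata gets added is a set of category flags checked at insertion time. The sketch below is purely illustrative: the categories, the `FrameInfo` struct, and `BuildMetadata` are hypothetical and are not the temporary function mentioned above, and the tag names are examples only.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical metadata categories; the real tag groups may differ.
enum MetadataCategory : std::uint32_t {
    kBitDepth   = 1u << 0,
    kCameraInfo = 1u << 1,
    kTimestamps = 1u << 2,
};

// Hypothetical per-frame information available at insertion time.
struct FrameInfo {
    std::string cameraLabel;
    unsigned bitDepth;
    double elapsedMs;
};

// Builds the tags for one image at insertion time, adding only the
// categories that are enabled, so disabled categories cost nothing.
std::map<std::string, std::string> BuildMetadata(const FrameInfo& f,
                                                 std::uint32_t enabled) {
    assert(!f.cameraLabel.empty());
    std::map<std::string, std::string> tags;
    if (enabled & kBitDepth) tags["BitDepth"] = std::to_string(f.bitDepth);
    if (enabled & kCameraInfo) tags["Camera"] = f.cameraLabel;
    if (enabled & kTimestamps) tags["ElapsedTime-ms"] = std::to_string(f.elapsedMs);
    return tags;
}
```

A caller that only needs camera identity could pass `kCameraInfo` alone and skip the string formatting for every other category.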
Multi-camera
It is possible to use the v2 buffer with multi-camera devices. However, since the v2 buffer's flexibility makes it a more general solution (e.g. it supports different image sizes, types, etc) than the multi-camera device adapter, in my opinion that adapter should be deprecated and application code that relies on it updated to use the v2 buffer.
One addition here is the getLastTaggedImageFromDevicePointer("cameraLabel"), which enables you to get the last image from a specific camera, rather than having to search backwards through the most recent images and read their metadata.
A step towards a single route for all data
The pointer-based API gives a good opportunity to start moving towards a single route for all data, rather than a separate route for snap and sequences. I don't think it's possible to fully do this without changing how cameras handle data for snaps, but in the meantime the GetImagePointer function now copies the snap buffer in camera adapters into the v2 buffer, returning a pointer to it. This should be faster than copying into the application memory space because it can be multithreaded, and still allows the pointer-based handling of the data from the application layer.
Pointer based image handling
You get these through methods like getLastImagePointer(), which return a TaggedImagePointer object. This object is a wrapper around the TaggedImage object, but it will not load the pixels until you call getPixels(), or if you never want to use them you can call release(), or just use the metadata without pixels like: