Support for large downloads in Complex Portal #29

Open: wants to merge 7 commits into base branch develop


jmedinaebi (Contributor) commented on Apr 17, 2025

I have made changes to the Complex Portal endpoints to:

  1. Return objects instead of serialising them to string before returning
  2. Use DeferredResult to make search and export endpoints asynchronous with a timeout
  3. Use threads and CompletableFuture on export to parallelise fetch and serialisation of complexes
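
The third point can be sketched with the JDK alone. This is a minimal illustration under stated assumptions, not the PR's code: `ParallelExportSketch`, `chunk`, `serialiseChunk`, the pool size and the `;` separator are all hypothetical, and the timeout here comes from `Future.get(timeout, unit)` rather than from Spring's `DeferredResult`, which the real endpoints use.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.stream.Collectors;

final class ParallelExportSketch {

    // Splits the accessions into consecutive chunks of at most chunkSize elements.
    static List<List<String>> chunk(List<String> acs, int chunkSize) {
        List<List<String>> chunks = new ArrayList<>();
        for (int i = 0; i < acs.size(); i += chunkSize) {
            chunks.add(new ArrayList<>(acs.subList(i, Math.min(i + chunkSize, acs.size()))));
        }
        return chunks;
    }

    // Hypothetical stand-in for fetching and serialising one chunk of complexes.
    static String serialiseChunk(List<String> chunk) {
        return String.join(",", chunk);
    }

    // Serialises every chunk on a worker thread, waits for all of them
    // (up to the timeout) and merges the results in the original order.
    static String export(List<String> acs, int chunkSize, long timeoutSeconds) {
        ExecutorService pool = Executors.newFixedThreadPool(4); // pool size is an assumption
        try {
            List<CompletableFuture<String>> futures = new ArrayList<>();
            for (List<String> c : chunk(acs, chunkSize)) {
                futures.add(CompletableFuture.supplyAsync(() -> serialiseChunk(c), pool));
            }
            CompletableFuture<Void> allFutures =
                    CompletableFuture.allOf(futures.toArray(new CompletableFuture[0]));
            allFutures.get(timeoutSeconds, TimeUnit.SECONDS);
            // Every future is complete at this point, so join() does not block.
            return futures.stream()
                    .map(CompletableFuture::join)
                    .collect(Collectors.joining(";"));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Export interrupted", e);
        } catch (ExecutionException | TimeoutException e) {
            throw new IllegalStateException("Export failed or timed out", e);
        } finally {
            pool.shutdownNow();
        }
    }
}
```

Iterating over the list of futures, rather than collecting results as they complete, keeps the merged output in the same order as the input chunks.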

Changes are deployed in http://ves-hx-47.ebi.ac.uk:8110/intact/complex-ws/

CompletableFuture<Void> allFutures = CompletableFuture.allOf(futures.toArray(new CompletableFuture[0]));

// The writer needs to be created before the async logic to use the right XML writer factory
SerialisedComplexesWriter mainComplexWriter = ComplexWriterFactory.getSerialisedComplexesWriter(format);

jmedinaebi (Contributor, Author) commented:

As the comment says, the writer needs to be created before the asynchronous call. Otherwise the XmlOutputFactory (or something like that; I don't fully remember the name) creates the wrong XmlWriter and there are issues. The classes registered at this level are not the same as those registered inside an asynchronous thread, which is what causes those issues.
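
A minimal sketch of that ordering, using only the JDK. It assumes the factory in question behaves like `javax.xml.stream.XMLOutputFactory`, whose lookup can depend on the calling thread's context class loader; that is a guess at the cause, not something confirmed here. `WriterBeforeAsyncSketch`, `writeComplex` and `exportAsync` are hypothetical names.

```java
import java.io.StringWriter;
import java.util.concurrent.CompletableFuture;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

final class WriterBeforeAsyncSketch {

    // Hypothetical helper: serialises one accession as a small XML element.
    static String writeComplex(XMLOutputFactory factory, String ac) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter xml = factory.createXMLStreamWriter(out);
            xml.writeStartElement("complex");
            xml.writeAttribute("ac", ac);
            xml.writeEndElement();
            xml.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not serialise " + ac, e);
        }
        return out.toString();
    }

    static String exportAsync(String ac) {
        // Resolve the factory on the calling thread, where the expected
        // implementation is visible...
        XMLOutputFactory factory = XMLOutputFactory.newInstance();
        // ...and only capture the resulting object in the asynchronous task,
        // so no factory lookup happens on the worker thread.
        return CompletableFuture.supplyAsync(() -> writeComplex(factory, ac)).join();
    }
}
```

The sketch uses one factory per task; sharing a single factory between concurrent tasks would need a check that the implementation in use is thread-safe.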

List<CompletableFuture<String>> futures = new ArrayList<>();
for (List<String> complexesAcsChunk : complexesAcsChunks) {
// The writer needs to be created before the async logic to use the right XML writer factory
ComplexWriter complexWriter = ComplexWriterFactory.getComplexWriter(format);

jmedinaebi (Contributor, Author) commented:

Same as above.

private final IntactDao intactDao;

@Transactional(readOnly = true, propagation = Propagation.REQUIRED, value = "jamiTransactionManager")
public String fetchAndWriteComplexes(Collection<String> complexesAcs,

jmedinaebi (Contributor, Author) commented:

This is in a separate class so that the @Transactional annotation can be used. With asynchronous methods and parallel threads I was getting DB errors about lazy collections and no session. This method is annotated with Propagation.REQUIRED, so it joins the current transaction or creates a new one, which ensures a session is available.
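
The underlying problem can be illustrated without Spring or a database. Spring binds transactional resources to the current thread, so a worker thread does not see the caller's session. The sketch below imitates that with a plain `ThreadLocal`; `ThreadBoundSessionSketch`, `loadLazily`, `inSession` and `runOnWorker` are hypothetical, and `inSession` is only a rough analogue of `Propagation.REQUIRED`, not how Spring implements it.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

final class ThreadBoundSessionSketch {

    // Stand-in for a persistence session bound to the current thread.
    private static final ThreadLocal<String> SESSION = new ThreadLocal<>();

    // Stand-in for touching a lazy collection: fails if this thread has no session.
    static String loadLazily(String ac) {
        if (SESSION.get() == null) {
            throw new IllegalStateException("no session");
        }
        return ac + "@" + SESSION.get();
    }

    // Rough analogue of Propagation.REQUIRED: reuse the session of the
    // current thread if there is one, otherwise open one for this call.
    static <T> T inSession(Supplier<T> work) {
        if (SESSION.get() != null) {
            return work.get();
        }
        SESSION.set("session-" + Thread.currentThread().getName());
        try {
            return work.get();
        } finally {
            SESSION.remove();
        }
    }

    // Runs the task on a worker thread while the caller holds its own session.
    static String runOnWorker(Supplier<String> task) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        SESSION.set("caller-session");
        try {
            return CompletableFuture.supplyAsync(task, worker).join();
        } finally {
            SESSION.remove();
            worker.shutdown();
        }
    }
}
```

Calling `runOnWorker(() -> loadLazily("CPX-1"))` fails because the caller's session is not visible on the worker thread, whereas wrapping the same call in `inSession(...)` opens a session on the worker and succeeds.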

import java.util.stream.Collectors;
import java.util.stream.Stream;

@Log4j
@Controller
@AllArgsConstructor
public class SearchController {

jmedinaebi (Contributor, Author) commented:

Large diff in this class, but it's mostly cleaning up and moving stuff to other classes. Now the controller is mostly just the endpoints.

import java.io.StringWriter;
import java.util.Map;

public class ComplexWriterFactory {

jmedinaebi (Contributor, Author) commented:

We have 2 types of writers.

  • ComplexWriter and classes that implement it: writers that write complexes into a string
  • SerialisedComplexesWriter and classes that implement it: writers that combine already-serialised complex strings into the right format, adding the header and footer just once.
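
The split between the two writer types can be sketched as follows. The interfaces and the JSON-like output are hypothetical stand-ins chosen for illustration; they are not the actual `ComplexWriter` and `SerialisedComplexesWriter` signatures.

```java
import java.util.List;
import java.util.stream.Collectors;

final class TwoWriterSketch {

    // Hypothetical analogue of ComplexWriter: serialises one chunk of
    // complexes into a fragment with no header or footer.
    interface FragmentWriter {
        String write(List<String> complexAcs);
    }

    // Hypothetical analogue of SerialisedComplexesWriter: combines the
    // fragments and adds the header and footer exactly once.
    interface FragmentMerger {
        String merge(List<String> fragments);
    }

    // Writes each accession as a small JSON object, comma-separated.
    static final FragmentWriter JSON_FRAGMENTS = acs -> acs.stream()
            .map(ac -> "{\"ac\":\"" + ac + "\"}")
            .collect(Collectors.joining(","));

    // Wraps all non-empty fragments in a single JSON array.
    static final FragmentMerger JSON_ARRAY = fragments -> fragments.stream()
            .filter(fragment -> !fragment.isEmpty())
            .collect(Collectors.joining(",", "[", "]"));
}
```

With this split, each chunk can be serialised independently (and in parallel), while the document-level wrapper is produced once at the end by the merging writer.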

@@ -17,16 +17,25 @@

<Configure id="Server" class="org.eclipse.jetty.server.Server">

<Call name="addConnector">

jmedinaebi (Contributor, Author) commented:

Changes related to the Jetty plugin update.

EliotRagueneau left a comment:

Looking all good to me, really clear PR despite a complex subject. I assume you tested the performance and observed a significant improvement?

jmedinaebi (Contributor, Author) commented:

> Looking all good to me, really clear PR despite a complex subject. I assume you tested the performances and observed significant improvement?

It's deployed in https://wwwdev.ebi.ac.uk/intact/complex-ws/ so you can try it out if you want. You can also try it directly from the Complex Portal view in GitHub Pages, https://complex-portal.github.io/complex-portal-view/home, which is currently calling the dev service with no limit on the number of complexes to download.

I did try a few times and if I remember correctly I managed to download all the complexes in different formats. Definitely more than the 5k limit we currently have.
