Cloudflow application deployment model

Hello

First of all, I should say that I use a modular approach when creating Cloudflow applications: I create an individual module for each streamlet and aggregate them under a single root project.
All Cloudflow project components are chained together using blueprint.conf, and the whole application is built and deployed atomically. If I need to make any change to a streamlet, even a minor one, I have to build and redeploy the whole app, which feels monolithic and too cumbersome in some cases.
So, is there any way to split a single application into a few sub-modules that can be deployed independently while still using the same blueprint?


Hi Alex

The current version of Cloudflow does indeed force you to redeploy the entire application in one step, even though the Cloudflow operator tries to make changes and restart pods only where necessary.

That said, we are indeed aware of this limitation and we are making slow steps towards more individual control while keeping the central role of the blueprint.

We have started work on a new feature that will allow you to split the Cloudflow application into multiple individual images, which can then be linked together through the blueprint. Deployment will then focus on the blueprint instead of on one single image, and will only change the parts of the running application that now have a different image/tag associated with them.
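
To make the idea concrete, here is a minimal sketch of blueprint-driven redeployment. It is not Cloudflow code: the `plan_redeploy` function, the streamlet names, and the image names are all hypothetical, and the sketch simply assumes that a blueprint can be reduced to a mapping from streamlet name to image/tag. Comparing the running mapping with the desired one yields the set of streamlets that would need to be touched.

```python
from typing import Dict, List

# Hypothetical model: a blueprint reduced to "streamlet name -> image:tag".
Blueprint = Dict[str, str]


def plan_redeploy(running: Blueprint, desired: Blueprint) -> Dict[str, List[str]]:
    """Compare two blueprints and list which streamlets need which action."""
    start = sorted(name for name in desired if name not in running)
    stop = sorted(name for name in running if name not in desired)
    restart = sorted(
        name for name in desired
        if name in running and running[name] != desired[name]
    )
    keep = sorted(
        name for name in desired
        if name in running and running[name] == desired[name]
    )
    return {"start": start, "stop": stop, "restart": restart, "keep": keep}


# Made-up example: only the processor image gets a new tag.
running = {
    "ingress": "registry.example.com/ingress:1.0.0",
    "processor": "registry.example.com/processor:1.0.0",
    "egress": "registry.example.com/egress:1.0.0",
}
desired = dict(running, processor="registry.example.com/processor:1.0.1")

plan = plan_redeploy(running, desired)
print(plan)
```

In this sketch only the streamlet whose image/tag changed ends up under `restart`; everything else stays under `keep`, which is the behavior described above.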

When it comes to allowing individual deployment of single streamlets/images without going through the blueprint, that is something we haven’t yet seen a good solution for. For example, how would the blueprint on the file system (or in git, GitHub, etc.) be updated if we allowed parts of the application to be replaced through a side channel?

We try to focus on allowing for a GitOps model where the blueprint is central. That said, we are definitely open to new ideas on this so please let us know what you think.

Age


Hello Age,

Can you please tell me where I can find a roadmap for further Cloudflow development?

This feature looks very promising, since it is a possible way to split massive pipelines into smaller, more fine-grained ones.

Is there any documentation available that describes this behavior? Thanks!

I totally agree on this point; the approach of separate sub-images connected under one blueprint looks much more convincing.

I’m working on one right now that I hope to publish next week.

Not yet, I’m afraid. We have a backlog item to create architecture documentation that describes how things are implemented, what happens during deployment, etc. We hope to pick this up soon. Until then, we are happy to answer any direct questions here, on our Gitter channel, or through GitHub issues.

I’m glad you like the multi-image approach. We hope to release that in the next few months. You can see some of the early work already being committed to the repo this week.

Age


I really appreciate the detailed clarification!

I have one more question, concerning the use of a schema registry, or more precisely whether a schema management process will be implemented, since I’ve heard questions about it during some webinars. Should I start a new topic?

I would like to hear more details about the features that you are interested in. I can tell you that we do plan on making some minor changes to how schemas are linked to streamlets, to make it possible to download schemas directly from a schema registry during the build.

We have also looked at the various problems surrounding schema migration and data migration for backwards-incompatible schemas, but that is quite a complex issue, so I’d be very interested to know which features you would like to see us implement in that context.

Age

This is probably the most anticipated feature we’re waiting for - a kind of central schema registry. I think it fits well and looks aligned with the ‘sub-image’ feature that you’ve mentioned before.

Currently we don’t have any issues, because we have no real production usage yet, but I will come back to you with any news on it. Thank you!

So, just to confirm, we are talking about the ability to maintain the master schemas in a registry and then download them from there through some kind of sbt dependency, correct?

If so, are you using the Confluent Schema Registry? That is the only one we are aware of, so if you are using (or thinking of using) something else, that would be interesting for us to learn about.

Yes, you’re absolutely right. The Confluent Schema Registry seems to be the only solution in that case.
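
As an illustration of what "downloading a master schema from the registry" could look like, here is a minimal sketch against the Confluent Schema Registry REST API, which serves the latest version of a subject at `/subjects/{subject}/versions/latest`. This is not the planned Cloudflow or sbt integration. The helper names, the registry URL, and the `sensor-data-value` subject are assumptions made for the example, and the parsing step assumes an Avro schema (whose `schema` field is a JSON-encoded string).

```python
import json
import urllib.request
from urllib.parse import quote


def latest_schema_url(registry_url: str, subject: str) -> str:
    """Build the REST URL for the latest registered version of a subject."""
    return f"{registry_url.rstrip('/')}/subjects/{quote(subject, safe='')}/versions/latest"


def parse_schema_response(body: str) -> dict:
    """Extract version, id, and the decoded Avro schema from a registry response."""
    payload = json.loads(body)
    return {
        "version": payload["version"],
        "id": payload["id"],
        # Assumption: Avro schema, so the "schema" field is itself JSON text.
        "schema": json.loads(payload["schema"]),
    }


def fetch_latest_schema(registry_url: str, subject: str) -> dict:
    """Download and parse the latest schema (needs a reachable registry)."""
    request = urllib.request.Request(
        latest_schema_url(registry_url, subject),
        headers={"Accept": "application/vnd.schemaregistry.v1+json"},
    )
    with urllib.request.urlopen(request) as response:
        return parse_schema_response(response.read().decode("utf-8"))


# Offline usage example with a made-up response body (no network access needed).
sample_body = json.dumps({
    "subject": "sensor-data-value",
    "version": 3,
    "id": 42,
    "schema": json.dumps({
        "type": "record",
        "name": "SensorData",
        "fields": [{"name": "deviceId", "type": "string"}],
    }),
})
parsed = parse_schema_response(sample_body)
print(parsed["version"], parsed["schema"]["name"])
```

A build step could call something like `fetch_latest_schema` and write the result to the location where the streamlet expects its schema file; how that would be wired into sbt is left open here.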

Thanks. That is definitely on our backlog to implement in the next couple of months or so.

Age

Hello Age,

Can you please tell me whether the roadmap is available now? Thanks.

Not yet. I’m working on it.
In short, these are the major things we are currently working on or planning to work on soon:

  • support for blueprints with references to multiple app images, i.e. deploying the blueprint instead of deploying a single combined image (work started)
  • an overhaul of the configuration system so you can pass more kinds of configuration at deployment time and in a more convenient way (e.g. files). This will allow you to configure the underlying runtime, such as Akka/Spark/Flink properties.
  • open sourcing our operator-based installer and adding support for upgrades
  • schema registry support

There is more, but that is still in various stages of readiness.

Hope this helps,
Age