Web3 CMO Stories

AI and Web3 Synergy: Flock.io's Approach to Secure Model Training – with Jiahao Sun | S4 E29

August 09, 2024 Joeri Billast & Jiahao Sun Season 4

What if you could harness the power of AI without compromising your data privacy?

Join us as we welcome Jiahao Sun, the trailblazing founder and CEO of Flock.io, who shares his incredible journey from Oxford to leading innovative AI initiatives in the financial sector. Jiahao reveals the birth of Flock.io, a groundbreaking platform that ingeniously marries federated learning with blockchain technology, ensuring sensitive data stays secure and local.

Dive into the importance of community-driven AI development and understand the potential hazards of data misuse, which Jiahao compares to the perils of bioweapons targeting DNA.

Explore the revolutionary applications of decentralized AI across various industries as Jiahao highlights secure AI companions, financial transaction bots, and groundbreaking healthcare solutions like glucose monitoring. Discover the crucial role of privacy in these applications, particularly within the Web3 landscape. Get excited about the future of AI and metaverses with upcoming advancements like GPT-5's enhanced multimodal capabilities.

Finally, understand how cutting-edge technologies like zero-knowledge proofs and fully homomorphic encryption will reshape privacy-preserving machine learning. Learn how you can engage with Flock.io's unique offerings and be a part of this transformative journey.

This episode was recorded at EthCC in Brussels on July 9, 2024. Read the blog article and show notes here: https://webdrie.net/ai-and-web3-synergy-flock-ios-approach-to-secure-model-training-with-jiahao-sun/

Jiahao:

AI models are only good when it comes to that real, private, high-quality data. So that got me thinking about how we can have a system or a new technology that can actually solve this problem.

Joeri:

Hello everyone and welcome to the Web3 CMO Stories podcast. My name is Joeri Billast and I'm your podcast host, and today I'm at EthCC in Brussels and I'm joined by Jiahao Sun. Hi Jiahao, how are you doing?

Jiahao:

Hey, bonjour, very well, thanks. Thanks for having me.

Joeri:

Happy to have you. Guys, if you don't know Jiahao, he's the founder and CEO of Flock.io, a community-driven platform facilitating the creation of on-chain decentralized AI models. So, Jiahao, welcome to the podcast. To start off, could you share a bit about your journey and what inspired you to create Flock.io, and also feel free to tell us more about your company?

Jiahao:

Yeah, sure. So a little bit about my background. I graduated in computer science from Oxford and then, straight out of school, did a bit of startups and then got into the big corporates, actually becoming their director of AI in traditional financial industries. That's the 10-year history of my career, all in AI. And then in 2022, I was, yeah.

Jiahao:

Because of all this experience, I figured out what was happening in traditional industries, especially in what we call sensitive industries, where data is so important: none of them are willing to share any data with each other. But when it comes to AI, it's always good to have more data to train AI models. How can you solve this? Every bank, every institution has its own data silo. They keep the data safe within their own walls and don't want to share it with others. But AI models are only good when it comes to that real, private, high-quality data. So that got me thinking about how we can have a system or a new technology that can actually solve this problem and give models easier access to large quantities of data without putting that data at risk of any privacy leaks. So that's Flock, basically. That's why we started Flock.

Jiahao:

Flock actually stands for federated learning plus blockchain: FLock. It also means a flock of data, a flock of nodes together. Federated learning is a technology that Google introduced in 2017. Nowadays, when everyone's using phones, Apple or Google phones, sorry, Android phones, you are actually using federated learning, for example in personalized recommendations or input predictions for your typing habits. Google claims that all that data stays securely on your local phone, but you never know. How do you know that Google is not playing evil? That's why we proposed that blockchain join this federated learning system.

Jiahao:

People don't have to believe in any third party that says it has super governance over the training. It's decentralized: we actually have blockchain to do the governance for us. So that's what we call Flock. At the beginning of the journey, we drafted papers, wrote them and published them. That was our very first priority, to make sure our stuff is peer reviewed in top academic journals and conferences. And then we started to build up our code bases, making them robust and scalable so that we can actually deliver to the market. So it's been quite some time, and now we have actually released the whole decentralized training platform. Many of our clients are building their models, and a community is working on their models on our platform right now. It's super.

Joeri:

That sounds amazing to me. Like I mentioned to you when we were preparing the episode, I love it when technologies come together, like AI, Web3 and decentralization, and also the aspect of protecting personal and sensitive data. I think it's really important. Can you talk a bit more about that?

Jiahao:

Oh, about data, right? Sensitive data, yeah. So personal data is so important. Nowadays we know that AI can actually create a digital twin of you, right? Or even use your voice to talk to people and scam them.

Jiahao:

It's so important to have your data secured in your own place, or, if you are going to contribute to something, to control how much access others can get and whether you can keep everything safe in your own place. So, as we said, Flock is one of those technologies where your data stays local. We don't upload your data to any third party. The model actually runs locally and produces a model change, which we call a model update. We then only combine these updates with the updates from others, and a general model gets updated from them, right? So it's a layer of defense to make sure your local data doesn't actually leak.
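
The local-training-and-shared-updates scheme described here can be sketched with plain federated averaging. This is a minimal illustration under stated assumptions, not Flock's actual implementation: the tiny linear model, the function names (`local_update`, `aggregate`, `federated_training`) and the learning-rate and round settings are all hypothetical, and the blockchain governance layer is left out entirely.

```python
from typing import List, Tuple

Weights = List[float]                 # [slope, intercept] of a tiny linear model
Dataset = List[Tuple[float, float]]   # (x, y) pairs that never leave the client


def local_update(global_w: Weights, data: Dataset,
                 lr: float = 0.05, epochs: int = 20) -> Weights:
    """Train on private data locally and return only the weight delta."""
    w, b = global_w
    n = len(data)
    for _ in range(epochs):
        # Gradient of the mean squared error for y ~ w * x + b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return [w - global_w[0], b - global_w[1]]


def aggregate(global_w: Weights, updates: List[Weights],
              sizes: List[int]) -> Weights:
    """Add the sample-size-weighted average of client deltas to the global model."""
    total = sum(sizes)
    return [
        g + sum(u[i] * s for u, s in zip(updates, sizes)) / total
        for i, g in enumerate(global_w)
    ]


def federated_training(clients: List[Dataset], rounds: int = 50) -> Weights:
    """Run several rounds; only deltas, never raw data, reach the aggregator."""
    global_w = [0.0, 0.0]
    for _ in range(rounds):
        updates = [local_update(global_w, data) for data in clients]
        global_w = aggregate(global_w, updates, [len(d) for d in clients])
    return global_w


# Two hypothetical clients whose private data both follow y = 2x + 1.
clients = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0), (4.0, 9.0)],
]
model = federated_training(clients)
print(model)  # slope and intercept learned without pooling the raw data
```

Only the deltas returned by `local_update` are passed to `aggregate`; the `(x, y)` pairs stay inside each client's list. Real systems typically add further protections on top of this, since model updates themselves can still reveal information about the data.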

Jiahao:

So, private data, yes, it's very important. Think about it this way: because AI is getting more and more important and more and more powerful, your private data might essentially become your DNA. You can imagine that one day, if biotechnology becomes super, super powerful and your DNA becomes a key to lots of things, then bioweapons will be super, super dangerous, because they can target your own DNA. With data, that moment is just arriving a bit earlier. When AI kicks in, your data becomes your DNA, and people can actually use your data to create something that targets you. That's super, super dangerous.

Joeri:

Thank you so much. We were just interrupted by people entering the room. The AI will solve that, of course. Yeah, thank you, Jiahao. What is, of course, an important aspect of Web3, and also of the fact that we are here at the EthCC conference, is community. Flock.io is described as a community-driven platform. How do you engage and empower your community to contribute to the development of decentralized AI models?

Jiahao:

Yeah. First of all, I think the beauty of blockchain itself is governance, the mechanism that helps the general public actually participate in and govern the training process, right? The other part of it is the incentive. I can see clearly the audit trail of everyone who actually contributed to a model, either by contributing their data or contributing their models or their brands. It's very transparent, so everyone who actually contributed can get incentives for their contributions. And in the future, when the model is used in multiple different places, you will have a distribution of your incentives. For example, say three percent of the model's contributions were made by you; then you always have that three percent of the incentive. So when you train a model, when you deliver a model, you can see that, okay, I have a long-term relationship with this model because I always get those incentives. I can build a long-term strategy for model training.
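
The three-percent example can be made concrete with a small pro-rata split. This is only a sketch under assumptions: the episode does not say how contributions are measured or paid out, so the ledger, the contribution units, the names and the helper functions `contribution_shares` and `distribute` are all hypothetical.

```python
from fractions import Fraction
from typing import Dict


def contribution_shares(contributions: Dict[str, int]) -> Dict[str, Fraction]:
    """Turn recorded contribution units into exact fractional shares."""
    total = sum(contributions.values())
    if total <= 0:
        raise ValueError("at least one positive contribution is required")
    return {who: Fraction(units, total) for who, units in contributions.items()}


def distribute(revenue: int, shares: Dict[str, Fraction]) -> Dict[str, int]:
    """Split revenue (in smallest token units) pro rata, rounding down."""
    return {who: int(revenue * share) for who, share in shares.items()}


# Hypothetical audit trail: alice contributed 3 of the 100 recorded units.
ledger = {"alice": 3, "bob": 47, "carol": 50}
shares = contribution_shares(ledger)

# Each time the model earns something, the same shares apply again.
payouts = distribute(1_000, shares)
print(payouts)  # {'alice': 30, 'bob': 470, 'carol': 500}
```

Because the shares are derived once from the recorded contributions, every later use of the model pays the same three percent to the same contributor, which is the long-term relationship described above. Exact fractions avoid floating-point drift; rounding down can leave a small undistributed remainder that a real system would have to account for.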

Jiahao:

That's the beauty of blockchain, which adds more value to the AI industry. And when it comes to community-owned models, that's actually super important, because we all know the earlier scandal with Google Gemini, right, the AI that created some very biased pictures of people. It's super bad. Sundar Pichai came out to apologize for this big PR issue for their company. So that's what happens when you rely entirely on a centralized company to do the model training for you. That's why we propose this democratization of AI, where the creation of a model relies on a group of people. They might have their own ideology or their own preferences when it comes to evaluating the model, and they should do it their way. We shouldn't have just one set of people sitting somewhere in San Francisco guiding the whole world on how AI should react. We should have the general public decide what their model looks like. So that's what we saw, and that's actually what Web3 can enable for the industry.

Joeri:

Yeah, it's exciting, everything that Web3 can enable, and, as I said, when it comes together with AI, it's even more powerful. What are some of the most exciting technological advancements in decentralized AI that you are seeing right now, and how is Flock.io contributing to this?

Jiahao:

Yeah, so I think since GPT, lots of attention has been put into the AI space, and there are tons of interesting innovations, not only on the infrastructure side, with decentralized GPU clustering or decentralized data storage, but also on the algorithm side, where there are lots of problems that people are trying to tackle. For example, Flock is one of them, trying to use this sophisticated layer of algorithms that can actually facilitate the training and, together with GPUs decentralized across the whole world, utilize them to do the training, so that you don't even need to have a computer at home yourself but can still join this decentralized training. Yeah, this is one of them. And there are other technologies as well.

Jiahao:

Well, people might have heard of a lot of them, like FHE and ZK, zero-knowledge proofs, right, and many of those try to make sure that data transmission becomes super, super private, so that people don't need to worry that their core data has to be revealed when it comes to machine learning training. We are supporters of all these fancy technologies, and we even published papers about ZKFL, using Flock as a mechanism and having ZK as one of the privacy and security methodologies to enhance the whole training process even more. So in the future, maybe, when, I don't know, quantum becomes a thing, right, quantum computing, quantum security, there will be a quantum Flock or something like that. So yeah, there's been a lot of interesting technology in this space making decentralized compute much more feasible than before.

Joeri:

Maybe also interesting for our listeners. Can you maybe give some examples from the industry or from clients, or about decentralized AI and how it is used?

Jiahao:

Yeah, in the industry for decentralized AI, the very big use case on our network right now is AI companions, the companion applications. You can imagine that there are people who want to have someone to talk to, but they don't want someone who only reacts to some open-domain data, because that's not them. They want someone who actually understands them. But to get that, you need to hand over all your data to tune a model that actually understands you. It's risky in many cases, and many people don't want to do that, right? So our platform is a solution where you keep your data safe but can still have a very good assistant that understands everything about you. That's the use case with the highest volume. But there are also other cases that have the highest public good or the highest revenue stream, for example, the bots that help you do transactions based on your intent. You just need to type some natural language and it will do all the bridging transactions.

Jiahao:

I believe in this community, everyone lost some money on bridging because this is super, super complicated, right? And yeah, that's actually one of the biggest revenue stream use cases for us because all this financial data, it's hard for people to share it. Right? It's like I don't want to share my wallet with you. I know it's open, but not open to you or not open to a third party, because otherwise you know all of my financial background. So they want a way that can do it privately but can still reflect their interest in terms of bridging.

Jiahao:

Yeah, and also some Web2 cases. In the early days, we actually worked with hospitals across the world on glucose monitoring for patients, so they don't have to send their data across borders or to any central server. We can use Flock to facilitate this and still fine-tune a model that helps them predict their glucose level in the next hour or so, so that they know whether they may need to take insulin. So there are many of these cases, I think, especially in these highly sensitive areas, where Flock can run to preserve privacy and make sure the model runs at a larger scale, in a decentralized way.

Joeri:

Yeah, privacy and trust. In this world of Web3, like you say, there are so many scams, and people are not sure who to share their data with or what they are sharing. Something that I personally also find interesting is the trend towards virtual worlds, where there is also the aspect of privacy, of course, and I love it, like I said, when technologies come together.

Jiahao:

Metaverse, yeah, nice term. The metaverse was a hot topic last year and then suddenly nobody talks about it. That's probably our industry: there's always hype about one thing and then suddenly it fades. But AI is a long-term stream. I think there's no doubt, because topics can fade very quickly, but AI will be a very long-term topic, since it's not just happening in Web3. It's happening in both Web2 and Web3, and in every industry everyone is trying to look into it, apply it or even research it. So it's a long-term thing, first of all. Secondly, the metaverse. I'm happy to talk about this, because in the early days the metaverse was more like a game place for people to play in, or a digital twin. There was a time, I think even back when I was in my previous jobs in banks, when even the banks were talking about how they could build a digital twin of their clients or of a specific city, so that they could have better knowledge for predictions when it comes to ultra-high-net-worth clients or to changes in their clients' financial situation. Because it's digital, there's a digital twin that they can run simulations on, right? It is cool. Then, earlier last year, Stanford Town came out as a game. Actually, it's a beautiful paper where they use GPT as a driver for all the characters in a small town, and then the characters just live their own way.

Jiahao:

What's the TV drama called, Westlife or Westworld? Westworld, yeah, yeah. Everyone is actually an NPC, but they have their own thoughts and talk to each other. So, yeah, I think AI at today's stage can hugely drive the metaverse world. I guess you are quite interested in this space as well, right? So it's not like two years ago, when it was just some 3D scanning of your body and that's it.

Jiahao:

Right, they can now have some of your thoughts, if you are willing to use Flock, maybe, to lay out your data so that your data stays local, but you can still have a digital twin that runs in a metaverse and does all the, what should I say, activities just like you do, right? Yeah, I can see that happening. And even for the creation of the metaverse, you might not even need UI/UX designers in the future. It can be just auto-generated: a generated verse with generated characters, which can be mapped to some of your friends who registered, and then you can just play happily in a virtual world. It's cool. I think AI is giving this industry a lot of imagination. I'm not quite familiar with the metaverse industry nowadays, but I think some of them will stand out from the crowd very quickly, with all these super-powerful algorithms joining them to create a better metaverse. Yeah, what are the best ones?

Joeri:

The best ones? That depends on where your audience is and what you want to do with it. There have already been different guests on my show from different metaverses. If it's more about gaming, we had Sébastien Borget from The Sandbox on the show, and I think he's also around at EthCC. You have the Tangra Metaverse, which is more for education, for universities or even business schools. I even gave a course for Rutgers Business School in the US in the metaverse, actually about the metaverse. And then you have, of course, also Spatial, where I already did networking events for businesses. But it really depends.

Joeri:

I was also at Fashion Week in Decentraland. That's another type of metaverse. So there are the Web3 metaverses and the non-Web3 metaverses. But if people are listening now and they're interested in that, there are for sure other episodes on the show that you can check out. Also, what I'd like to know from you, because a lot is changing very fast: what is coming up next, this year or maybe next year, that you see happening in this space? So maybe look a little bit into the future.

Jiahao:

A little bit in the future. Yes, of course. I think one of the most interesting things to me is GPT-5 and its capabilities, which I believe will include multimodal capability. Multimodal typically means a bot that can understand not just the context of textual language, but also a picture, a video clip or even music, and can answer questions about a piece of music, even "What do you think of this music?", something like that. And it's not only that. It also comes with what we call thinking about the object. For example, yeah, it's a bit abstract, but I did this in my early days. It's actually a task called VQA, visual question answering. It's an updated version of the Turing test, because somebody gamed the Turing test already, so we have an even better version of it, the visual Turing test.

Jiahao:

So there's a photo of a lady holding a banana above her mouth, right? It's just a fun photo. And the question is: what is the lady's mustache made of? This is a very tricky question. It was so tricky for the bots to answer at that time, right? Nowadays, with multimodal, they need to understand that when we talk about a mustache, it means a specific area of the face, and what is it made of? Normally it's made of hair, but this time it's a banana. So it's quite a complex thing. It's not just "I can process a photo and say it's a cat or a dog". It's more about imagination, about thinking about the photo and what's actually happening inside it. And I think they are already beating those VQA types of tasks. So this is a very important and interesting thing happening in the near future, which I guess will come out next year. And, back to our earlier question, it will boost the metaverse a lot. With AI in the metaverse, the agents can capture whatever they see in the metaverse and then do their own reasoning. And that reasoning is not just simple question answering, like it's a cat or it's a dog. They can even reason about whether there's a banana or whether something is placed somewhere else, or even what you imagine a cloud in the sky to be, right? So it's a next-level type of thing. That's one direction in the AI space which I think is almost certain to happen in the near future.

Jiahao:

In the further future, I guess what interests me a lot is, as we mentioned, zero-knowledge proofs and FHE. Because they are, in my terms, super sexy as privacy-preserving methodologies, right? But the only thing now is that they are super expensive to compute. You can't really imagine it: when we do zero-knowledge-proof machine learning or fully homomorphic encryption machine learning, it might take 100 years just to run one algorithm, right? But in the coming years I think this cost will go down quickly, and I'm very excited to see the day when we can really have this level of encryption in machine learning.

Joeri:

Yeah, zero-knowledge proofs. Actually, as we are recording this podcast today, the next podcast episode to be released is with someone talking about Aleo. So yeah, you know them. Oh yeah. So, guys, you probably want to hear more about everything that Jiahao is doing at Flock.io. So, Jiahao, where can I send people? Where can they find you? Where can they learn more about Flock.io?

Jiahao:

Thank you. So that's the reason why we named ourselves Flock.io, right? That's exactly the website itself. So, basically, flock.io: you just go to the website. There are a lot of interesting things happening there. We have a marketplace for our bots, which have already been trained. It's on the beta.flock.io site. Or, if you want to train your own model, it's on our train.flock.io site. But everything's on our main website, so you can just browse around. You can join the training yourself or you can just play around with the bots other people have already developed. So, yeah, welcome to join our community. Of course, great.

Joeri:

So thank you so much, Jiahao, it was a pleasure talking to you.

Jiahao:

Thank you, thanks for having me.

Joeri:

Guys, what an insightful episode. It's the first episode here at EthCC. There will be a couple more that I will be recording today and in the next days. Thank you so much for listening. If you think this episode is useful for people around you, be sure to share it with them. If you're not yet following the show, this is a really good moment to do so. If you haven't given me a review yet, it would really help me if you gave me those five stars, so I can reach even more people. And, of course, I would like to see you back next time. Take care.
