Lior Hakim is Co-founder and Chief Technical Officer of Hour One, an industry leader in crafting virtual humans for professional video communications. The lifelike virtual characters, modeled exclusively after real people, convey human-like expressiveness from text, empowering businesses to elevate their messaging with unmatched ease and scalability.
Could you share the genesis story behind Hour One?
The origin of Hour One can be traced back to my involvement in the crypto domain. After that endeavor I started pondering what the next big thing that mass cloud compute could tap into might be, and as machine learning was gaining popularity in recommendations and predictive analytics, I was working on a number of ML infrastructure related projects. Through this work I got acquainted with early generative works and was especially excited by GANs at the time. I was using all the compute I could get my hands on to test those then-new technologies. When I showed my results to a friend who had a company in the field, he told me I had to meet Oren. When I asked why, he told me that perhaps each of us would stop wasting his own time and we could waste one another's time instead. Oren, my co-founder and the CEO of Hour One, was an early investor in AI at the time, and while we stood in different places we were each moving in the same direction; the founding of Hour One to be the Home of the Virtual Human was an inevitable journey.
What are some of the machine learning algorithms that are used, and what part of the process is Generative AI?
In the realm of video creation, machine learning algorithms are instrumental at every stage. In the scripting phase, Large Language Models (LLMs) offer invaluable support, crafting or refining content to ensure compelling narratives. As we move to audio, Text-to-Speech (TTS) algorithms morph text into organic, emotive voices. Transitioning to the visual representation, our proprietary multimodal foundational model of the virtual human takes center stage. This model, enhanced with Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs), is adept at conveying contextual emotions, enunciation, and an articulate, captivating, and authentic delivery. Such generative techniques turn text and audio cues into lifelike visuals of virtual humans, resulting in hyper-realistic video outputs. The orchestration of LLMs, TTS, GANs, VAEs, and our multimodal model makes Generative AI not just a component but the backbone of modern video production.
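The staged orchestration described here (script refinement, then speech synthesis, then visual generation) can be sketched roughly as follows. This is a minimal illustration with stubbed stand-in functions, not Hour One's actual (proprietary) pipeline; every function name and return value below is hypothetical:

```python
# Illustrative only: each stage is a stub standing in for a real model.

def refine_script(draft: str) -> str:
    """Stand-in for an LLM pass that polishes the narrative."""
    return draft.strip().capitalize()

def synthesize_speech(script: str) -> bytes:
    """Stand-in for a TTS model turning text into an audio waveform."""
    return script.encode("utf-8")  # placeholder for waveform bytes

def render_virtual_human(script: str, audio: bytes) -> dict:
    """Stand-in for a multimodal generative model (e.g. GAN/VAE-based)
    that maps text and audio cues to video frames."""
    return {"transcript": script, "frames": len(audio)}

def generate_video(draft: str) -> dict:
    """Chain the three stages: text -> audio -> video."""
    script = refine_script(draft)
    audio = synthesize_speech(script)
    return render_virtual_human(script, audio)

video = generate_video("welcome to our quarterly update")
print(video["transcript"])
```

The point of the sketch is the data flow: each stage consumes the previous stage's output, so the text and audio cues jointly condition the final visual generation.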
How does Hour One differentiate itself from competing video generators?
At Hour One, our distinction from other video generators doesn't stem from a preoccupation with competition, but rather from a deeply rooted philosophy governing our approach to quality, product design, and market strategy. Our guiding principle is to always prioritize the human element, ensuring our creations resonate with authenticity and emotion. We take pride in delivering the best quality in the industry without compromise. By utilizing advanced 3D video rendering, we provide our users with a true cinematic experience. Moreover, our strategy is uniquely opinionated; we start with a polished product and then rapidly iterate towards perfection. This approach ensures that our offerings are always a step ahead, setting new benchmarks in video generation.
With your extensive background in GPUs, can you share some insights on your views on the NVIDIA Next-Generation GH200 Grace Hopper Superchip Platform?
The Grace Hopper architecture is truly a game changer. If the GPU can effectively work from its host's RAM without completely bottlenecking the computation, it unlocks currently impossible model/accelerator ratios in training, and as a result, much-desired flexibility in training job sizes. Assuming the entire stock of GH200 will not be swallowed up by LLM training, we hope to use it to greatly reduce prototyping costs for our multimodal architectures down the road.
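To illustrate why host-addressable RAM changes the model/accelerator ratio, here is a back-of-the-envelope sketch. The capacities are illustrative round numbers in the ballpark of published GH200 configurations (roughly 96 GB of GPU-local HBM3 plus 480 GB of CPU LPDDR5X reachable over NVLink-C2C); consult NVIDIA's spec sheets for exact figures, which vary by configuration:

```python
def max_params(memory_gib: float, bytes_per_param: int = 2) -> float:
    """Rough upper bound on parameter count that fits in a memory pool,
    assuming fp16/bf16 weights and ignoring activations, optimizer
    state, and framework overhead."""
    return memory_gib * 2**30 / bytes_per_param

# Illustrative capacities only:
hbm_only = max_params(96)          # GPU-local HBM
with_host = max_params(96 + 480)   # plus host RAM over NVLink-C2C

print(f"HBM only:  ~{hbm_only / 1e9:.0f}B params")
print(f"With host: ~{with_host / 1e9:.0f}B params")
```

Even under these crude assumptions, letting the accelerator spill into host memory multiplies the model size a single chip can prototype against, which is the flexibility referred to above.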
Are there any other chips that are currently on your radar?
Our main goal is to provide the user with video content that is price competitive. Given the current demand for large-memory GPUs, we are continuously optimizing and trying out every GPU cloud offering from the top cloud service providers. Furthermore, we try to be at least partially platform independent on some of our workloads. Thus we are eyeing TPUs and other ASICs, and also paying close attention to AMD. Eventually any hardware-led optimization route that can result in a higher FLOPs/$ ratio will be explored.
What’s your vision for future advancements in video generation?
In 24 months we will not be able to tell a generated human from a captured one. That will change a lot of things, and we are here at the forefront of those advancements.
At the moment most generated videos are for computers and mobile devices. What needs to change before we have photo-realistic generated avatars and worlds for both augmented reality and virtual reality?
As of now, we possess the capability to generate photo-realistic avatars and worlds for both augmented reality (AR) and virtual reality (VR). The primary obstacle is latency. While the delivery of high-quality, real-time graphics to edge devices such as AR and VR headsets is vital, achieving this seamlessly is contingent upon several factors. Foremost, we are reliant on advancements in chip manufacturing to ensure faster and more efficient processing. Alongside this, optimizing power consumption is crucial to enable longer usage without compromising the experience. Last but not least, we anticipate software breakthroughs that can efficiently bridge the gap between generation and real-time rendering. As these elements come together, we'll see a surge in the utilization of photo-realistic avatars and environments across both AR and VR platforms.
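To give a sense of how tight the latency obstacle is, the arithmetic below converts common VR display refresh rates into per-frame time budgets. The ~20 ms motion-to-photon figure referenced in the comment is a commonly quoted comfort target in the VR literature, not an Hour One specification, and generation, rendering, and transport must all fit inside it:

```python
def frame_budget_ms(refresh_hz: float) -> float:
    """Per-frame time budget, in milliseconds, at a given refresh rate."""
    return 1000.0 / refresh_hz

# Common VR headset refresh rates; a comfortable motion-to-photon
# latency is often quoted around ~20 ms total.
for hz in (72, 90, 120):
    print(f"{hz} Hz -> {frame_budget_ms(hz):.2f} ms per frame")
```

At 90 Hz that is roughly 11 ms per frame, which is why generating photo-realistic content on the fly for headsets is so much harder than rendering a pre-generated video to a phone screen.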
What do you expect to be the next big breakthrough in AI?
When it comes to the next significant breakthrough in AI, there is always an air of excitement and anticipation. While I've alluded to some advancements earlier, what I can share is that we're actively working on several groundbreaking innovations at this very moment. I'd love to delve into specifics, but for now, I encourage everyone to keep an eye on our upcoming releases. The future of AI holds immense promise, and we're thrilled to be at the forefront of these pioneering efforts. Stay tuned!
Is there anything else that you would like to share about Hour One?
You should definitely check out our Discord channel and API, the latest additions to our platform offering at Hour One.