#Start
It has been a long time since I did a side project. I was busy with my work, and moving from Hungary to the Netherlands took a long time (especially the paperwork). With this out of the way, I started searching for a new technology to learn. I wanted something that I would use on a daily basis. I also don't have a mobile application in my portfolio yet, and I think it would be a good addition.
#Source for inspiration
I started a new Twitter account to follow people who are interested in tech: the people who share their projects and how they made them. I want to keep up with the news buzzing around the AI community and stay on the edge of it, as it helps a lot in my work. I saw one tweet about a new AI model called LLaVA, made by haotian-liu, which is a new LLM trained on top of the LLaMA2 model with more tweaks for recognizing images. What really pulled me in was a YouTube video from Jason Zhou. He was testing LLaVA by having it describe images. There was more hidden under the surface of what LLaVA is actually capable of doing (I'm not saying that asking "What to cook for 5 people?" is a bad question). From these two sources, I decided to start the journey of making my own AI application, built around the LLaVA model. The application would take any image you have and make a story from it.
#What about my work?
One thing I did learn from my current work is to always plan your next move. When I get a new task, I have to make an RFC about it. The first one is a 25 (project definition), where I lay out the concept of the project, what my feature will actually solve, and above all, who the target customer is. Then comes 45 (User-Case), where I lay out the user interaction with the feature: if the user clicks here, the program will show this, or behave in this way. Lots of things go into these files, and at that point I haven't even started writing code yet. It was hard at the beginning to learn these things, but it was worth it.
I wanted my side project to follow these steps, but on a smaller scale.
#Javascript is good
I decided to go with React Native as the framework, with the Expo library on top. Expo offers a template that makes the learning curve less steep to start with, in addition to some pre-made components. My website already uses React, so I had the basics covered. I watched YouTube videos about React Native animations and navigation between components (or screens, to talk mobile-wise), and did a little Figma drawing as a proof of concept for the layout I wanted for the app. I can tell you that coming up with the layout and the button navigation took a long time.
#Tinted royal blue
The application would need good animation and good colours. The functionality won't matter if the UX is bad. I got to the point where I spent one week searching for colours that fit the mood of the application. There is more to it, but I went with blue just because I like it.
#Functionalities
This is the part that I wanted to get to. I started with the repository (a private one, of course) and made the layout of the app. The app would have fixed functionalities:
- A good (interactive) welcome screen
- Pick an image and choose the genre of story you want to have.
- View the result in a separate window.
- Share the result in an SVG with your friends.
- Make money somehow from all of this.
#welcomeSlides.jsx
I had to take good care with this one. If the user doesn't like it, or there is a small bug in it, the user will delete the app. Most mainstream apps have swiping slides with an animation in the background that carries over from one slide to the next, as a sign of continuity. The first thought I had was to give the app animations about making the story: a thumb that moves from one point on the screen to another, a popup of the generated picture, and the user sharing it at the end.
In the final version, I decided to go with a splash screen for the Storify logo and make the welcome slides in three steps. The first slide is for choosing an image, the second is for the genre, and the last one is for the result. This reduces the number of interactions the user needs in order to see what the app is about.
#Pick image
This is the main screen, where the user can choose their image. There are 5 genres a user can choose from (Action, Drama, Romance, Sci-fi, and Comedy). Each one has its own colour that sets it apart from the others. I had to test lots of views to find the aspect ratio that fits best on different devices and orientations.
#Loading animation
The problem with the Replicate API is the time it takes to boot up the model and generate output. If the model hasn't been used for a while, it needs a cold boot before generating, which can take up to 1 minute of loading. I couldn't imagine the user staring at their screen for one minute straight, waiting for a small output. I overcame this problem by having my API send a request to the LLaVA model every 3 to 4 hours to keep it up and running. It was an automated Python script. It would cost me money, but at this point I cared more about the user experience than the money I would spend to get users. I also made a nice loading animation for the time spent waiting on the LLM promise.
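The keep-warm idea can be sketched roughly like this. It is only a minimal sketch, not the actual script: `keep_warm`, `next_delay_seconds`, and the `ping` callable are hypothetical names, and `ping` stands in for whatever small request the backend sends to the model.

```python
import random
import time
from typing import Callable, Optional


def next_delay_seconds(min_hours: float = 3.0, max_hours: float = 4.0,
                       rng: Optional[random.Random] = None) -> float:
    """Pick a random wait between min_hours and max_hours, in seconds."""
    rng = rng or random.Random()
    return rng.uniform(min_hours, max_hours) * 3600


def keep_warm(ping: Callable[[], None], rounds: int,
              sleep: Callable[[float], None] = time.sleep) -> int:
    """Call ping() `rounds` times, waiting 3 to 4 hours between calls.

    `ping` is a placeholder for the real warm-up request (an assumption).
    Returns how many pings succeeded; a failed ping is logged and skipped.
    """
    succeeded = 0
    for i in range(rounds):
        try:
            ping()
            succeeded += 1
        except Exception as exc:  # keep the loop alive if one request fails
            print(f"warm-up request failed: {exc}")
        if i < rounds - 1:
            sleep(next_delay_seconds())
    return succeeded
```

Passing `sleep` in as a parameter makes it possible to try the loop without actually waiting for hours. In a real deployment the same job could also be handed to a cron-style scheduler instead of a long-running loop.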
#Viewing result
After the story is generated, the user is taken automatically to the main screen, where they see a card with the resulting story. The card has a linear gradient based on the dominant colour of the image (all dynamic) and a very smooth animation for opening the result. You can notice how the card's background scales up to fill the result screen.
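To give an idea of how a gradient can be derived from an image, here is a small sketch in Python. It is not the code the app uses: the bucketing approach, the function names, and the 50% darkening for the second stop are all assumptions made for illustration.

```python
from collections import Counter
from typing import Iterable, List, Tuple

RGB = Tuple[int, int, int]


def dominant_colour(pixels: Iterable[RGB], bucket: int = 32) -> RGB:
    """Group pixels into coarse colour buckets and return the centre
    of the most common bucket."""
    counts = Counter(
        (r // bucket, g // bucket, b // bucket) for r, g, b in pixels
    )
    if not counts:
        raise ValueError("no pixels given")
    (rb, gb, bb), _ = counts.most_common(1)[0]
    half = bucket // 2
    return (rb * bucket + half, gb * bucket + half, bb * bucket + half)


def to_hex(colour: RGB) -> str:
    """Format an RGB tuple as a #rrggbb string."""
    return "#{:02x}{:02x}{:02x}".format(*colour)


def gradient_stops(colour: RGB, darken: float = 0.5) -> List[str]:
    """Return two hex stops: the colour itself and a darker shade of it."""
    dark = tuple(int(c * darken) for c in colour)
    return [to_hex(colour), to_hex(dark)]
```

For example, an image that is mostly deep blue with a few white pixels would give a blue stop and a darker blue stop, which can then be fed to whatever linear gradient component the UI uses.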
If this is the first time a user generates a story, they see an on-screen hint to swipe up for the full content of the story.
#Sharing a story
Inspiration for this part came from Spotify Wrapped. I noticed they always put their name at the top, along with the name of the artist that got selected. There had to be a dynamic way to do it. I did some searching and saw that they use their accent colours to achieve this.
I took the inspiration from it and followed (nearly) the same approach with an auto-generated SVG. The only catch was keeping the image at a fixed ratio so it wouldn't overlap the text. I used a component to crop the image before generating the shared image. There is also the option to save the image on the phone or share it through social media.
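The fixed-ratio crop comes down to a little arithmetic. Below is a minimal sketch of it in Python, assuming a centred crop; the function name and the ratios used in the example are hypothetical, since the post does not say which ratio the app uses.

```python
from typing import Tuple


def centre_crop_box(width: int, height: int,
                    ratio_w: int, ratio_h: int) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the largest centred crop
    whose aspect ratio is ratio_w:ratio_h."""
    if min(width, height, ratio_w, ratio_h) <= 0:
        raise ValueError("all sizes and ratios must be positive")
    # Compare width/height with ratio_w/ratio_h without using floats.
    if width * ratio_h > height * ratio_w:
        # Image is too wide: keep the full height and trim the sides.
        crop_w = height * ratio_w // ratio_h
        crop_h = height
    else:
        # Image is too tall (or already fits): keep the full width.
        crop_w = width
        crop_h = width * ratio_h // ratio_w
    left = (width - crop_w) // 2
    top = (height - crop_h) // 2
    return (left, top, left + crop_w, top + crop_h)
```

A 1920x1080 landscape photo cropped to 1:1 keeps the middle 1080x1080 square, while a 1000x2000 portrait photo cropped to 4:5 keeps the full width and trims the top and bottom.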
#Making money
I wanted the app to become as popular as it could without the restriction of paying to generate, but I still had to keep my backend running somehow. I had to follow the evil path of adding ads to the application. The best place for them was while the story is being generated.
It was a good feeling to see that I made my first cent from the application.
#Publishing to Play Store
I thought that after finishing the application, I'd be good to go. I can say it wasn't like this at all. It turns out that Google Play Console has lots of pre-testing phases before you can publish your app to the store. The last phase was having 20 people in closed testing for 14 days. I had to send the app to friends and family to try it. I thought it was useless as the app was done, but I got lots of feedback about things I hadn't paid close attention to. Thanks to everyone who helped in testing the app.
I sent an announcement email to inform people about their enrollment. I got more than 500 tester emails (not all of them joined, as some thought it was an iOS application). With a Python script plus an HTML page to serve as a template, everything was good to go.
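A script like that could look roughly like the sketch below. It is not the original script: the template text, `build_announcement`, and the addresses and URL in the usage note are made up for illustration. It only builds the message, using the standard library's `string.Template` and `email.message.EmailMessage`.

```python
import html
from email.message import EmailMessage
from string import Template

# Hypothetical HTML template; the real one lived in a separate HTML page.
HTML_TEMPLATE = Template("""\
<html>
  <body>
    <p>Hi $name,</p>
    <p>You are enrolled in the closed test of $app_name.</p>
    <p><a href="$join_url">Join the test</a></p>
  </body>
</html>
""")


def build_announcement(sender: str, recipient: str, name: str,
                       app_name: str, join_url: str) -> EmailMessage:
    """Build one announcement email with a plain-text and an HTML part."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"You are enrolled in the {app_name} closed test"
    # Plain-text fallback for mail clients that do not render HTML.
    msg.set_content(
        f"Hi {name},\n\nYou are enrolled in the closed test of {app_name}.\n"
        f"Join here: {join_url}\n"
    )
    # Escape the values before placing them into the HTML template.
    msg.add_alternative(
        HTML_TEMPLATE.substitute(
            name=html.escape(name),
            app_name=html.escape(app_name),
            join_url=html.escape(join_url, quote=True),
        ),
        subtype="html",
    )
    return msg
```

Calling `build_announcement("me@example.com", "tester@example.com", "Sam", "Storify", "https://example.com/join")` gives a message that can be sent with `smtplib.SMTP.send_message`, looping over the list of tester addresses.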
#Spreading the word
I had to go to various channels to spread the word about the app. I went to the Discord channel where I had found my testers. I got a good number of new users there and, above all, more advice on features to add to the app. I also started sending lots of emails to different tech channels, asking them for a review on their social media.
#Finally
I can say that the project went well after nearly 3 months of development and doing everything from start to finish. It had good momentum at the beginning. At the time of writing this blog, the app has 120 downloads (in one week), with Egypt as the top country and the USA second. I'm glad about how the project went; see you in the next one.