Running LLaMA and Gemma LLMs with C++ and Python
Nowadays, “mobile AI” is a fast-growing trend. Smartphones are becoming more powerful, and large models are becoming more efficient. Some users may prefer to wait until phone manufacturers ship these features, but can we run the newest AI models on our own? Indeed, we can, and the results are fun. In this article, I’ll show how to run the LLaMA and Gemma large language models on an Android phone, and we’ll see how well it works. As in all my tests, the models run locally; no cloud APIs or payments are needed.
Let’s get into it!
Termux
The primary component of our test is Termux, a full-fledged Linux terminal environment packaged as an Android application. It’s free, and it doesn’t require root access; all Linux components run exclusively inside Termux’s own folder. Termux can be downloaded from Google Play, but at the time of writing, that version was quite old, and the “pkg update” command no longer worked. A more recent version is available as an APK on the F-Droid website; it works well, and I had no problems with it.
Once Termux is installed on the phone, we can launch it and see a standard Linux command-line interface:
In theory, we can enter all commands directly on the phone, but typing on the tiny on-screen keyboard is inconvenient. A much better way is to install SSH; this can be done with “pkg install”:
pkg update
pkg upgrade
pkg install openssh
After that, we can start the SSH daemon in Termux by running the sshd
command. We also need to get the user name and set an SSH password:
sshd
whoami
#> u0_a461
passwd
#> Enter new password
...
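By default, sshd has to be started by hand every time Termux is opened. As a small convenience, a line like the following could be added to Termux’s ~/.bashrc (a sketch, assuming the default bash shell and the openssh package installed above; not part of the original setup):

```shell
# ~/.bashrc in Termux (sketch): start the SSH daemon on launch
# unless an sshd process is already running
pgrep -x sshd >/dev/null || sshd
```

With this in place, the phone is reachable over SSH as soon as Termux starts.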
Now, we can connect to the phone with any SSH client:
ssh -p 8022 u0_a461@192.168.100.101
Here, 8022 is the default Termux SSH port, “u0_a461” is the user name we got from the “whoami” command, and “192.168.100.101” is the IP…
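To avoid typing the port, user name, and IP address every time, the connection details can be stored in the SSH client’s ~/.ssh/config on the desktop machine. A minimal sketch, using the values from the command above and a hypothetical host alias “phone” (your user name and IP will differ):

```shell
# ~/.ssh/config on the desktop machine (sketch; "phone" is a made-up alias)
Host phone
    HostName 192.168.100.101   # the phone's IP address on the local network
    Port 8022                  # default Termux SSH port
    User u0_a461               # user name reported by whoami in Termux
```

After that, running “ssh phone” is enough to connect.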