FaaS for noobs

December 6, 2020 in AWS, FaaS

This is the first version of this article. Because of the nuances involved, and the things I forgot while writing it, I will come back to fix whatever I got wrong or missed. If you have any comments, please reach out. Thank you…

FaaS

FaaS means Factorization as a Service, and it is the name of a cool factorization framework that relies on an AWS EC2 cluster.

The framework code was released in 2015 and since then:

  • some patches were introduced to it, BUT
  • the EC2 environment itself changed a lot – this makes it kinda hard for newcomers to start with FaaS, as what awaits is a serious troubleshooting session…

The info below describes the items of interest by focusing on the sequence of changes one has to introduce to the original FaaS config and the hosting environment to make it work in 2020/2021.

Host OS
I created a new ‘main’ VM for this setup from scratch, using Ubuntu 20.04 (ubuntu-20.04-desktop-amd64.iso). I then installed and updated python2, python3 and their libraries as I went along (pip install …). I can’t describe all the changes here, but they are easy enough to spot. If your Python code doesn’t work –> update Python and the libraries. The pip command works very well and takes care of almost everything. Also, the good thing about this version of Ubuntu is that you really are in luck – Ubuntu 20.04 has almost everything you need, and the changes/adjustments required for FaaS are cosmetic in nature…

Amazon AWS

If you have never used AWS, I want you to think of it as a place where you go to buy a server like you buy a beer. The original FaaS focuses on buying that beer at the end of the party (Spot Instances) as opposed to buying it anytime you want (On Demand). The difference in price is substantial – usually 10-11 times. Yes, seriously. After a while you will notice that Spot Instances are sometimes hard to get (covid times!) and you may need to opt for On Demand instances – these will cost you a proper dollar. If you want to drink anytime you want, you need to pay a premium for ‘just in time’.
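To get a feel for what that gap means on a bill, here is a minimal sketch of the arithmetic. The hourly price, the instance count and the run time are made-up numbers for illustration only, and the 10x ratio is simply the rough figure mentioned above; real prices vary by region, instance type and time, so check the AWS pricing pages.

```python
def cluster_cost(instances: int, hours: float, hourly_price: float) -> float:
    """Total cost in dollars of running `instances` machines for `hours` hours."""
    return instances * hours * hourly_price


# Hypothetical numbers, NOT real AWS prices.
ON_DEMAND_PRICE = 1.60              # dollars per instance-hour (assumed)
SPOT_PRICE = ON_DEMAND_PRICE / 10   # assumes the ~10x gap described above

on_demand_total = cluster_cost(instances=100, hours=24, hourly_price=ON_DEMAND_PRICE)
spot_total = cluster_cost(instances=100, hours=24, hourly_price=SPOT_PRICE)

print(f"On Demand: ${on_demand_total:,.2f}")
print(f"Spot:      ${spot_total:,.2f}")
```

Plug in your own instance count and the current prices for your region before you launch anything.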

Another thing to remember is ‘where’ you buy that beer, aka where you procure these instances. Bar hopping is fun, but… you must be VERY CAREFUL about procuring instances across regions. It’s extremely easy to start toying around with AWS across regions and get ‘unexpected’ bills at the end of the month. What is an AWS region? It’s a place where you buy beer. It could be the US, EMEA, or APAC. And within these regions there are sub-regions that you need to explore.

If it doesn’t make sense… let’s start again. You want to lease a bunch of servers within a data center that is physically located in one of a few available places on Earth where Amazon hosts them. The bill for using each of these data centers comes to you separately. The moment you run/test/acquire some servers you owe money. Still not sure what I mean? Abandon this article. Or this will cost you money. Yes, go away and read more and come back when you are more comfy buying and paying for what you are using… There is no free AWS lunch.

You may think that it’s a very inflexible pricing model that ‘should be centralized’, but… it’s your own responsibility. There is no easy way to manage it other than keeping some sort of log of which regions you started playing with. And yes, take it very seriously. These dollars add up very quickly and you don’t want to pay a huge bill because you spawned a few resources in other regions and forgot to terminate them. Note: I really, really don’t blame Amazon/AWS. It’s you who procures and utilizes the resources. Always clean up after yourself. You have been warned. It’s almost a given that when you are new to AWS you will end up paying bills in more than one region. Yes, we all have to start somewhere.
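The ‘some sort of log’ idea can be as simple as the sketch below. It is a purely manual notebook: the `RegionLog` class is a hypothetical helper of mine, it does not talk to AWS at all, and the instance IDs are made up. It only remembers what you tell it you started and terminated, so it is no substitute for checking the AWS console in each region.

```python
from collections import defaultdict


class RegionLog:
    """Hypothetical helper: remembers which resources were started in which region."""

    def __init__(self) -> None:
        self._running = defaultdict(set)

    def started(self, region: str, resource_id: str) -> None:
        """Note that a resource was launched in the given region."""
        self._running[region].add(resource_id)

    def terminated(self, region: str, resource_id: str) -> None:
        """Note that a resource was terminated."""
        self._running[region].discard(resource_id)

    def still_running(self) -> dict:
        """Regions that still hold resources, i.e. regions that still cost money."""
        return {region: sorted(ids) for region, ids in self._running.items() if ids}


log = RegionLog()
log.started("us-east-1", "i-aaa")     # made-up instance IDs
log.started("eu-west-1", "i-bbb")
log.terminated("us-east-1", "i-aaa")
print(log.still_running())            # whatever shows up here still needs cleaning up
```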

At this stage YOU HAVE BEEN WARNED like 3 times? Continue reading at your own risk.

So… coming back to instance acquisition. A Spot instance is a shared resource you have to bid for, and an On Demand instance is something you acquire when you want and are eager to pay. How do you choose between Spot and On Demand instances at the FaaS level? Read on.

First things first.

In both cases you DO need to raise Support tickets with AWS to request a larger number of instances to be available to you. When you sign up to AWS for the very first time you are just a nobody and you are not trusted by default. And yes, they won’t raise these quotas w/o a proper justification, so be prepared to answer the questions they ask in the most honest/precise fashion. My experience is that AWS is pretty quick in replying and you get answers within 24-48h. They rarely give you what you ask for, which is why you should ask for more than you need by default.

After raising the tickets this is what I got:

Requested higher number of Spot Instances  --> 450+
Requested higher number of Instances --> 450+

Mind you – this is just for ONE region. If you plan to use instances in other regions you need to raise separate tickets!

Still with me?

Yes, it costs money.

Yes, it is pretty complicated.

FaaS build

I don’t have a very intricate knowledge of how FaaS works. It sounds absurd, but it’s true. I have read many files of this project and kinda ‘get it’, but I don’t know everything, and lots of terms I was introduced to while reading these files were new to me. I am not kidding. I was learning as I was going along.

So… my naive perception of how things work is as follows: FaaS builds a ‘master’ image where all the calculations are scheduled from, and where all the results are collected. It also builds ‘slave’ images that do the actual work. The latter are built via Amazon Machine Images (AMI). (I know this section needs to be extended to include more info on AMI and MPIs.)

One of the things you do when you use FaaS is build that AMI image. During the build you will see a failure caused by an old Python version (it will say that ‘remote version of python is too old’), so you have to ssh to that instance, update Python on it and restart the process.

YML scripts

All of the FaaS scripts use the old notation!!!

email: {{}}
N: {{}}

so you need to change it to the new notation, with quotes and ticks:

email: "{{}}"
N: '<number>'
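If there are many files to fix, the quoting change above can be scripted. The snippet below is a rough sketch, not part of FaaS: `quote_bare_templates` is a helper name I made up, `some_var` and `other` are placeholder variable names, and the regex only handles the simple one-line `key: {{ ... }}` case. Values that already contain a double quote are left alone on purpose, so review the diff by hand.

```python
import re

# Matches "key: {{ ... }}" lines where the value is a bare, unquoted template.
BARE_TEMPLATE = re.compile(
    r'^([ \t]*(?:-[ \t]+)?[\w.-]+:[ \t]+)(\{\{[^"\n]*\}\})[ \t]*$',
    re.MULTILINE,
)


def quote_bare_templates(text: str) -> str:
    """Wrap bare {{ ... }} values in double quotes; other lines stay unchanged."""
    return BARE_TEMPLATE.sub(r'\1"\2"', text)


# Placeholder YAML, not taken from the FaaS repository.
sample = (
    "email: {{ some_var }}\n"
    'N: "{{ other }}"\n'
    "name: test\n"
)
print(quote_bare_templates(sample))
```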

Also, they use the old notation for an elevated shell and refer to ‘sudo’; in newer Ansible you use ‘become’, i.e.:

sudo: yes|no

should become:

become: yes|no
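The rename can also be done mechanically. Here is a small sketch under the assumption that `sudo:` always appears as a key at the start of a line (optionally after a list dash); `sudo_to_become` is a hypothetical helper name, and inline forms such as `sudo=yes` are not handled.

```python
import re

# "sudo:" as a YAML key at the start of a line, optionally after "- ".
SUDO_KEY = re.compile(r'^([ \t]*(?:-[ \t]+)?)sudo:', re.MULTILINE)


def sudo_to_become(text: str) -> str:
    """Rename the old 'sudo:' key to 'become:' and leave everything else alone."""
    return SUDO_KEY.sub(r'\1become:', text)


# Placeholder playbook fragment for illustration.
old_task = (
    "- apt: name=python3\n"
    "  sudo: yes\n"
)
print(sudo_to_become(old_task))
```

Keys that merely start with the same letters are not touched, because the pattern requires the colon right after `sudo`.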

EC2 folder:

added a Debug section to main.yml -- it does not affect anything, it just lists detailed info, which helps with troubleshooting

    - name: Debug
      debug:
        msg: "{{ ec2 }}"

in some instances I had to enforce python3

ec2/build-finish.yml &
ec2/roles/build/tasks/install-msieve.yml

vars:
  ansible_python_interpreter: /usr/bin/python3

I also added an install of the full python3

ec2/roles/build/tasks/install-common.yml

changes:

  - apt: name=python3
    become: yes

ec2/roles/factor/templates/post_linalg.j2 & ec2/roles/factor/templates/post_sieve.j2

changes:

disabled termination of instances, just in case (JIC)

ec2/roles/launch/tasks/main.yml

added:

wait: true
instance_initiated_shutdown_behavior: terminate

ec2/vars/launch.yml

adjusted the number of instances & changed them to

type: c4.8xlarge
cores: 36

And this is pretty much it.

Sounds complicated?

Yes, it is. It should be. It took me 2-3 days of troubleshooting to finally make it work, and I must honestly admit that I still don’t know 100% how all the parts work together, but the exercise was worth the effort. Not only was I introduced to AWS and EC2 clusters, I actually ran a distributed calculation – something that a few years ago would not even have been possible. The possibilities of ‘rent what you need’ cannot be overstated – it’s a completely different world than 20 years ago. Having the ability to launch a parallel computing task w/o being a privileged scientist, a large corporation, or a government still blows my mind. I mean… if it is all about CPU cycles, then you can just acquire them and go with it.
