#+TITLE: Franck Cuny
#+AUTHOR: [[mailto:franck@fcuny.net][franck@fcuny.net]]
#+OPTIONS: toc:nil num:nil title:nil timestamp:nil prop:nil

I'm a seasoned Site Reliability Engineer with experience in large-scale distributed systems. I'm invested in mentoring junior and senior engineers to help them increase their impact, and I'm always looking to learn from those around me.

*Specializations*: distributed systems, containerization, debugging, software development, reliability.

* Experience
** Roblox, San Mateo
| Site Reliability Engineer | Principal (IC6) | SRE Group | Feb 2022 - to date |
I'm the Team Lead for the Site Reliability group that was started at the end of 2021.

I define the roadmap and identify areas where SREs can partner with different teams to improve the overall reliability of our services.
** Twitter, San Francisco
*** Compute
| Software Engineer         | Senior Staff | Compute Info | Aug 2021 - Jan 2022 |
| Site Reliability Engineer | Senior Staff | Compute SREs | Jan 2018 - Aug 2021 |
I was initially the Tech Lead of a team of 6 SREs supporting the Compute infrastructure. In August 2021 I moved to a Software Engineer role and led one of the efforts to adopt Kubernetes for our on-premise infrastructure. As Tech Lead, I helped define a number of internal processes for the team, from on-call rotations to postmortems.

Twitter's Compute is one of the largest Mesos clusters in the world (XXX thousands of nodes across multiple data centers). The team defined KPIs, improved automation to manage the large fleet of bare metal machines, and defined APIs for maintenance with partner teams.

In addition to supporting Aurora/Mesos, I also led a number of efforts related to Kubernetes, both on-premise and in the cloud.

Finally, I helped Twitter save XX millions of dollars in hardware by designing and implementing strategies to significantly improve the hardware utilization of our bare metal infrastructure.
*** Storage
| Site Reliability Engineer | Staff | Storage SREs | Aug 2014 - Jan 2018 |
For 4 years I supported the Messaging and Manhattan teams. I moved all the pub-sub systems from bare-metal deployments to Aurora/Mesos, making Messaging the first storage team to adopt the Compute orchestration platform. This helped reduce operational work, shorten time to deploy, and improve overall reliability. I pushed for adopting 10Gb+ networking in our data centers to help our team scale. I was the SRE Tech Lead for the Manhattan team, helping with performance, operations, and automation.
** Say Media, San Francisco
| Software Engineer | Senior SWE | Infrastructure | Aug 2011 - Aug 2014 |
During my time at Say Media, I worked on two different teams. I started as a software engineer on the platform team, building the various APIs. I then transitioned to the operations team to develop tooling that increased the effectiveness of the engineering organization.
** Linkfluence, Paris
| Software Engineer | Senior SWE | Infrastructure | July 2007 - July 2011 |
I was one of the early engineers to join Linkfluence in 2007. I led the development of the company's crawler (web, feeds). I was responsible for defining the early architecture of the company, and designed the internal platforms (Service Oriented Architecture).
I helped the company contribute to open source projects, contributed to them on its behalf, and represented the company at numerous open source conferences in Europe.
* Technical Skills
- *Languages*  Python, Go, Ruby, Perl
- *Frameworks* Kubernetes, Aurora, Mesos
- *Databases*  RDBMS, NoSQL
- *Dev tools*  Git