Node.js monolith serves:
- Customer app
- Driver app
- Warehouse app
- Internal site
Problem 1: items being reacquired
- Driver acquires item
- Item transferred to warehouse
- Item marked as shipped
- Driver reacquires item
Lots of confusion
- Ghost items re-appear in driver inventory
- User apps show funky UI
- Not clear how to resolve
Figuring out the valid states can be really hard
Really hard
- You need to get every call site
- You might break every call site!
- Clients might send a {state: 'X'}
- You don't know all of the transitions
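One way to make the valid states explicit is a plain transition table that every call site goes through. This is a minimal sketch, not the talk's implementation: the state names ACQUIRED, AT_WAREHOUSE and SHIPPED are assumptions inferred from the reacquisition example, and canTransition and transition are hypothetical helpers.

```javascript
// Hypothetical state names; the slides do not list the real ones.
const TRANSITIONS = Object.freeze({
  ACQUIRED: ['AT_WAREHOUSE'],
  AT_WAREHOUSE: ['SHIPPED'],
  SHIPPED: [], // terminal: a shipped item can never be reacquired
});

// Returns true only when `from -> to` is listed in the table.
function canTransition(from, to) {
  const allowed = TRANSITIONS[from];
  return Array.isArray(allowed) && allowed.includes(to);
}

// Applies a transition or throws, so a stray {state: 'X'} from a client
// is rejected instead of being written straight to the item.
function transition(item, to) {
  if (!canTransition(item.state, to)) {
    throw new Error(`Illegal transition ${item.state} -> ${to}`);
  }
  return { ...item, state: to };
}

const shipped = { id: 'A', state: 'SHIPPED' };
console.log(canTransition(shipped.state, 'ACQUIRED')); // false
```

With the table in one place, the list of legal transitions can be read directly instead of being reconstructed from every call site.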
Problem 2: duplicate submissions
- T1: Submit pickup A
- T2: Submit pickup A
- T1: Items.findOne(A)
- T2: Items.findOne(A)
- T1: Check STATES['SUBMITTED']
- T2: Check STATES['SUBMITTED']
- T1: item.state = 'SUBMITTED'
- T2: item.state = 'SUBMITTED'
- T1: item.save()
- T2: item.save()
- T1: *Attempt to find a driver*
- T2: *Attempt to find a driver*
Two drivers, same pickup
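The interleaving above can be reproduced with an in-memory stand-in for the database. This is only a sketch: makeStore, submitPickup and runRace are hypothetical names, the 'NEW' state is an assumption, and setImmediate stands in for a database round trip.

```javascript
// Yield to the event loop, mimicking the latency of a database call.
const tick = () => new Promise((resolve) => setImmediate(resolve));

// In-memory stand-in for the Items collection (hypothetical).
function makeStore() {
  const rows = new Map([['A', { id: 'A', state: 'NEW' }]]);
  return {
    async findOne(id) {
      await tick();
      return { ...rows.get(id) }; // each caller gets its own copy
    },
    async save(item) {
      await tick();
      rows.set(item.id, { ...item });
    },
  };
}

// Vulnerable pattern: read, check, modify, save.
async function submitPickup(store, id, worker, dispatched) {
  const item = await store.findOne(id);
  if (item.state === 'SUBMITTED') return false; // both workers pass this check
  item.state = 'SUBMITTED';
  await store.save(item);
  dispatched.push(worker); // stands in for "attempt to find a driver"
  return true;
}

// Runs T1 and T2 concurrently against a fresh store.
async function runRace() {
  const store = makeStore();
  const dispatched = [];
  await Promise.all([
    submitPickup(store, 'A', 'T1', dispatched),
    submitPickup(store, 'A', 'T2', dispatched),
  ]);
  return dispatched;
}

runRace().then((dispatched) => console.log(dispatched)); // both T1 and T2 dispatch
```

Both reads complete before either save lands, so the state check protects nothing.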
You can run that 1000 times and only one will succeed*
*Your database may not support CAS
CAS (compare-and-set): a fancy way of saying UPDATE with WHERE
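A compare-and-set can be sketched as a single conditional write whose row count tells the caller whether it won. The table and column names, the $1 placeholder style, the 'NEW' state, and the makeStore and submitPickup helpers are all assumptions for illustration; the in-memory store only imitates what a database does atomically inside one UPDATE statement.

```javascript
// The SQL shape of a compare-and-set (hypothetical table and columns).
const CAS_SQL =
  "UPDATE items SET state = 'SUBMITTED' WHERE id = $1 AND state = 'NEW'";

// In-memory stand-in for the database. A real database evaluates the WHERE
// clause and applies the write as one statement, then reports the row count.
function makeStore() {
  const rows = new Map([['A', { id: 'A', state: 'NEW' }]]);
  return {
    // Returns the number of rows changed (0 or 1), like an UPDATE would.
    compareAndSet(id, expected, next) {
      const row = rows.get(id);
      if (!row || row.state !== expected) return 0;
      row.state = next;
      return 1;
    },
  };
}

// No prior read: attempt the write and let the row count decide who won.
function submitPickup(store, id) {
  return store.compareAndSet(id, 'NEW', 'SUBMITTED') === 1;
}

const store = makeStore();
let winners = 0;
for (let i = 0; i < 1000; i++) {
  if (submitPickup(store, 'A')) winners += 1;
}
console.log(winners); // 1
```

Only the caller that sees a row count of 1 goes on to find a driver; every other caller stops.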
"jepsen compare and set [your database]"
Lesson: if you type "state machine" into npm, you are doing it wrong
Lesson: read-check-modify-save is incorrect/vulnerable
Just try the write
Problem 3: partial assignment
"Driver X is in ASSIGNED but doesn't
have any pickups."
Our #1 cause of oncall incidents
*Your ORM may not support transactions
*Your database may not support transactions
Transactions can go wrong
- Can deadlock (especially if connection pool is full!)
- Can degrade MVCC performance
- Missing a code path hurts big time
- Conn failure on rollback/commit
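A transaction that covers both writes, with every failure path rolling back, might look like the sketch below. The client object is a hypothetical database handle whose query(sql, params) returns a promise resolving to { rowCount }; the table names, column names and 'AVAILABLE' state are assumptions, and makeFakeClient exists only to exercise the code without a database.

```javascript
// Assign a driver and a pickup together, or not at all.
async function assignPickup(client, driverId, pickupId) {
  await client.query('BEGIN');
  try {
    const driver = await client.query(
      "UPDATE drivers SET state = 'ASSIGNED' WHERE id = $1 AND state = 'AVAILABLE'",
      [driverId]
    );
    if (driver.rowCount !== 1) throw new Error('driver not available');

    const pickup = await client.query(
      'UPDATE pickups SET driver_id = $1 WHERE id = $2 AND driver_id IS NULL',
      [driverId, pickupId]
    );
    if (pickup.rowCount !== 1) throw new Error('pickup already assigned');

    await client.query('COMMIT');
  } catch (err) {
    try {
      await client.query('ROLLBACK');
    } catch (rollbackErr) {
      // The connection may be dead; record that rather than hide it.
      err.rollbackError = rollbackErr;
    }
    throw err;
  }
}

// Fake client that logs the first word of each statement and can be told
// to fail on statements starting with `failOn` (hypothetical test helper).
function makeFakeClient(failOn = null) {
  const log = [];
  return {
    log,
    async query(sql) {
      log.push(sql.split(' ')[0]);
      if (failOn !== null && sql.startsWith(failOn)) {
        throw new Error('connection lost');
      }
      return { rowCount: 1 };
    },
  };
}

const failing = makeFakeClient('UPDATE pickups');
assignPickup(failing, 'X', 'A').catch(() => {
  console.log(failing.log.join(' ')); // BEGIN UPDATE UPDATE ROLLBACK
});
```

If the second write fails, the driver never ends up in ASSIGNED without a pickup, because the first write is rolled back with it.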
Conclusions
- Use state machines from the start
- (Use a WHERE clause with UPDATE)
- No read-modify-save
- Pick an ORM with transactions
Consistency is hard
Thanks!
Kevin Burke
These slides are available at: