Property-Preserving Encryption & ORAM

Topics PPE · Equality/Order Preserving · Inference Attacks · Oblivious RAM
01 //

How Do Data Breaches Happen?

Data flows through applications to users. Service providers (e.g. EMR systems, online dating) hold sensitive data. An adversary may compromise the provider. Encryption protects data, but can we protect data and let the application work (e.g. search, sort)?

Encryption Types
Type Characteristic
Homomorphic Computation on encrypted data matches plaintext result
Property Preserving Encrypted data in same order as plaintext
Searchable Search using encrypted keywords
Secure Computation Parties compute with private inputs
Functional Secret key reveals function being computed
02 //

Property-Preserving Encryption (PPE)

PPE is widely deployed (e.g. Microsoft Always Encrypted). It allows SQL queries on encrypted columns without decrypting on the server.

Why PPE Is Popular
Deployability

No change to application or database servers

Expressiveness

Supports most common SQL queries (equality, range, ORDER BY)

Efficiency

~25% overhead

Standard vs Property-Preserving

Standard Encryption

Leaks nothing except size. Cannot perform equality or range queries on ciphertext.

Equality-Preserving

Preserves equality. Leaks equality and frequency - same plaintext → same ciphertext; attacker sees frequency histogram.

Order-Preserving

Preserves order. Leaks order and frequency - plaintext order = ciphertext order.

Security Tradeoff

PPE trades security for functionality. Leaked information enables inference attacks.

03 //

PPE Leakage in Practice

Electronic Medical Records (EMR) study: 200 hospitals, 12,975 to 121,664 patients. Attributes include sex, race, age, mortality risk, disease severity, length of stay, etc. Inference attack: encrypted data + auxiliary data (e.g. public demographics) → recover plaintext.

Sensitive Attributes

Patient-sensitive: Sex, race, age, admission month, payer, disease, etc. Hospital-sensitive: Patient died, length of stay, mortality risk, disease severity, diagnostic category.

04 //

Attacks on PPE

Equality-Preserving: Frequency Analysis

Match ciphertext frequency histogram to auxiliary plaintext histogram. Ciphertext with frequency 10 maps to plaintext with frequency 11 (closest match). Classical technique (e.g. Al-Kindi 9th century).

Equality-Preserving: lp-Optimization Attack (CCS 2015)

Naveed, Kamara, Wright

Find assignment from ciphertext to plaintext that minimizes cost (e.g. histogram distance). Reduces to Linear Sum Assignment Problem (LSAP) - solved in O(n³) with Hungarian algorithm. Better recovery than simple frequency matching.

Equality-Preserving Attack Results (200 hospitals)

Sex: 100% patients, 95% hospitals. Patient Died: 100% patients, 100% hospitals. Mortality Risk: 100% patients, 99% hospitals. Age: 10% patients, 85% hospitals. Length of Stay: 83% patients, 50% hospitals. Major Diagnostic Category: 40% patients, 28% hospitals.

Order-Preserving: Cumulative Attack

Exploits Order + Frequency

Uses empirical CDF (cumulative distribution). Ciphertext greater than 90% of values → match to plaintext greater than ~90% of auxiliary data. Combines order and frequency; solves LSAP for best mapping. More effective than sorting attack alone.

Order-Preserving Attack Results

Disease Severity, Admission Month, Mortality Risk, Length of Stay: 100% patients, 100% hospitals. Age: 83% patients, 99% hospitals. Admission Type: 79% patients, 100% hospitals.

Attack Recap

Equality-preserving: frequency-based attacks. Order-preserving: cumulative attack (order + frequency). Both rely on auxiliary data (public histograms, demographics). Confidentiality of many EMR attributes is compromised.

05 //

Oblivious RAM (ORAM)

Even with encryption, access patterns leak information (which files, when). If we don't trust the cloud, we need to hide access patterns.

Obliviousness

For any fixed-size request sequence, the storage accesses observed by the cloud are statistically independent of the actual requests. Cloud cannot infer what data is being accessed.

ORAM Techniques

ORAM Flow

Application: get(key1), put(key1, val1), get(key2). ORAM Client translates to multiple download(object32), upload(object65, data19), etc. Each logical access → multiple cloud accesses; both read and write per access.

ORAM Systems

System Approach
PathORAM Tree-based (CCS 2013)
LayeredORAM Layered (CCSW 2011)
PracticalOS Large messages (CODASPY 2012)
ObliviStore Partition-based (S&P 2013)
ORAM Requirements

Client must have private randomness for dummy access patterns. Data must be encrypted. Each remote storage access typically includes both read and write to hide whether the application read or wrote.

06 //

Summary

Property-Preserving Encryption & ORAM - Takeaways
  • PPE - equality/order preserving; enables SQL on encrypted data; leaks frequency and/or order
  • Inference attacks - encrypted + auxiliary data → plaintext recovery; lp-optimization, cumulative attack
  • EMR study - many attributes recoverable; 100% for some (Patient Died, Disease Severity, etc.)
  • ORAM - hides access patterns; fixed blocks, encryption, dummy accesses; tradeoff: overhead

Further Reading

Naveed, Kamara, Wright - lp-optimization and cumulative attacks on equality- and order-preserving encryption.