1. What is a Hash Function?
A hash function is a mathematical algorithm that converts input data of any size into a fixed-size output called a hash, digest, or checksum. Think of it as a fingerprint for data.
Example: SHA-256 Hash
Input: "Hello, World!"
SHA-256: dffd6021bb2bd5b0af676290809ec3a53191dd81c7f70a4b28688a362182986f
No matter how large the input (a single character or a 10 GB file), the output is always the same size. SHA-256 always produces 256 bits (64 hexadecimal characters).
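As a quick check of the fixed-size property, the following Python sketch hashes inputs of very different lengths with the standard `hashlib` module. The helper name `sha256_hex` is chosen here purely for illustration.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()

# Inputs of very different sizes: 1 byte, 13 bytes, 1 MiB
inputs = [b"a", b"Hello, World!", b"x" * (1024 * 1024)]
for data in inputs:
    digest = sha256_hex(data)
    # Every digest is 64 hex characters (256 bits), whatever the input size
    print(len(data), len(digest), digest)
```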
Key Concepts:
- Hash/Digest: The fixed-size output of a hash function
- Collision: When two different inputs produce the same hash
- One-way function: Computationally infeasible to recover the input from the output
- Deterministic: Same input always produces same output
2. Properties of Cryptographic Hash Functions
1. Deterministic
The same input always produces the exact same output. Hash "Hello" 1000 times → same result every time.
2. Quick Computation
Hash functions are fast to compute. On modern hardware, SHA-256 can process from hundreds of megabytes to a few gigabytes per second per core, depending on hardware acceleration.
3. Pre-image Resistance (One-way)
Given a hash, it's computationally infeasible to find the original input. You cannot "reverse" a hash.
4. Collision Resistance
It should be extremely difficult to find two different inputs that produce the same hash output.
5. Avalanche Effect
A tiny change in input causes a completely different output. "Hello" vs "hello" → entirely different hashes.
Avalanche Effect Demonstration:
Input: "Hello"
SHA-256: 185f8db32271fe25f561a6fc938b2e264306ec304eda518007d1764826381969
Input: "hello" (just lowercase 'h')
SHA-256: 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824
// Completely different outputs from a single character change!
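To put a number on the avalanche effect, this Python sketch counts how many of the 256 output bits differ between the two digests. The function name `bit_difference` is an assumption made for this example.

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bits that differ between the SHA-256 digests of a and b."""
    da = int.from_bytes(hashlib.sha256(a).digest(), "big")
    db = int.from_bytes(hashlib.sha256(b).digest(), "big")
    # XOR leaves a 1 bit wherever the two digests disagree
    return bin(da ^ db).count("1")

changed = bit_difference(b"Hello", b"hello")
print(f"{changed} of 256 bits differ")
```

For a well-designed hash function, roughly half of the output bits are expected to flip, even though only one input bit changed.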
3. Algorithm Comparison Table
| Algorithm | Output Size | Security Status | Use Case |
|---|---|---|---|
| MD5 | 128 bits (32 hex) | ❌ Broken | Non-security checksums only |
| SHA-1 | 160 bits (40 hex) | ❌ Broken | Legacy systems only |
| SHA-256 | 256 bits (64 hex) | ✅ Secure | General purpose, blockchain |
| SHA-384 | 384 bits (96 hex) | ✅ Secure | Higher security needs |
| SHA-512 | 512 bits (128 hex) | ✅ Secure | Maximum security, 64-bit systems |
| SHA-3-256 | 256 bits (64 hex) | ✅ Secure | Alternative to SHA-2 family |
| BLAKE2 | Variable | ✅ Secure | Fast, modern alternative |
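The output sizes in the table can be checked with Python's standard `hashlib` module. This is a small sketch: the list uses hashlib's own algorithm names, and BLAKE2 is represented by its `blake2b` variant at the default digest size, which is an assumption of this example.

```python
import hashlib

# hashlib names for the algorithms in the table; BLAKE2 is shown via its
# blake2b variant at the default (maximum) digest size
ALGORITHMS = ["md5", "sha1", "sha256", "sha384", "sha512", "sha3_256", "blake2b"]

def digest_bits(name: str) -> int:
    """Return the default digest size of a hashlib algorithm, in bits."""
    return hashlib.new(name).digest_size * 8

for name in ALGORITHMS:
    bits = digest_bits(name)
    # Each hex character encodes 4 bits
    print(f"{name:10} {bits:4} bits ({bits // 4} hex characters)")
```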
⚠️ Warning: MD5 and SHA-1 are Broken
Practical collision attacks exist for both MD5 and SHA-1. In 2017, researchers at Google and CWI Amsterdam demonstrated SHAttered: two different PDFs with the same SHA-1 hash. Never use these algorithms for security purposes.
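To get a feel for what a collision is, the sketch below searches for one on SHA-256 truncated to 16 bits. This is an illustration only and not the SHAttered technique: no collisions are known for full-length SHA-256, and the function names and the 16-bit truncation are choices made for this example.

```python
import hashlib
from itertools import count

def truncated_hash(data: bytes, hex_chars: int = 4) -> str:
    """First hex_chars hex characters of SHA-256 (4 chars = 16 bits)."""
    return hashlib.sha256(data).hexdigest()[:hex_chars]

def find_collision(hex_chars: int = 4) -> tuple[bytes, bytes]:
    """Return two different inputs whose truncated hashes are equal."""
    seen: dict[str, bytes] = {}
    for i in count():
        candidate = str(i).encode()
        h = truncated_hash(candidate, hex_chars)
        if h in seen:
            # A previous, different input already produced this value
            return seen[h], candidate
        seen[h] = candidate

a, b = find_collision()
print(a, b, truncated_hash(a))
```

With only 16 bits of output, a collision typically turns up after a few hundred attempts. Each extra output bit makes the search harder, which is why digest size and the absence of structural weaknesses both matter.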
Quick Recommendation:
- General security: SHA-256
- Passwords: bcrypt, Argon2 (NOT SHA-256 directly)
- Performance-critical: BLAKE2
- Non-security checksums: MD5 is acceptable
4. Hash Function Use Cases
1. Password Storage
Never store plaintext passwords. Hash them (with proper algorithms like bcrypt) so even if the database is breached, passwords aren't exposed.
2. File Integrity Verification
Verify downloaded files haven't been tampered with. Compare the hash of your download against the published hash.
3. Digital Signatures
Hash the document, then sign the hash. Efficient because you're signing a small fixed-size hash instead of a large document.
4. Data Deduplication
Use hashes to identify duplicate files. Same hash = same content (with extremely high probability).
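A minimal Python sketch of hash-based deduplication follows. The function names `file_digest` and `find_duplicates` are hypothetical, and the approach shown (grouping paths by SHA-256 of their content) is one straightforward way to do it rather than a definitive implementation.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

def file_digest(path: Path, chunk_size: int = 8192) -> str:
    """SHA-256 of a file, read in chunks so large files are not loaded whole."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def find_duplicates(paths: list[Path]) -> list[list[Path]]:
    """Group paths by content hash and return the groups with 2+ members."""
    groups: dict[str, list[Path]] = defaultdict(list)
    for path in paths:
        groups[file_digest(path)].append(path)
    return [group for group in groups.values() if len(group) > 1]
```

It could be called as `find_duplicates(list(Path("downloads").glob("*.zip")))`, where the `downloads` directory is only an example.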
5. Blockchain & Proof of Work
Bitcoin uses SHA-256 for block hashing and mining. The chain's security depends on hash properties.
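The idea behind proof of work can be sketched in a few lines of Python. This is a deliberately simplified toy: the `mine` function, the text-encoded nonce, and the "leading zero hex digits" difficulty rule are assumptions for illustration. Bitcoin itself hashes a binary block header with SHA-256 applied twice and compares the result against a numeric target.

```python
import hashlib
from itertools import count

def mine(block_data: bytes, difficulty: int) -> tuple[int, str]:
    """Find a nonce so that SHA-256(block_data + nonce) starts with
    `difficulty` zero hex digits. Returns (nonce, digest)."""
    target_prefix = "0" * difficulty
    for nonce in count():
        digest = hashlib.sha256(block_data + str(nonce).encode()).hexdigest()
        if digest.startswith(target_prefix):
            return nonce, digest

nonce, digest = mine(b"example block", difficulty=4)
print(nonce, digest)
```

Finding the nonce requires many attempts, but anyone can verify the result with a single hash. Each additional zero digit multiplies the expected work by 16.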
6. HMAC (Message Authentication)
Hash-based Message Authentication Code combines a secret key with a hash to verify message integrity and authenticity.
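As a sketch of how this works in practice, the Python standard library's `hmac` module can both create and check a tag. The `sign` and `verify` names and the key value are placeholders for this example.

```python
import hashlib
import hmac

def sign(key: bytes, message: bytes) -> str:
    """Return the HMAC-SHA256 tag of message as a hex string."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()

def verify(key: bytes, message: bytes, tag: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign(key, message), tag)

key = b"example-secret-key"  # placeholder; use a randomly generated key
tag = sign(key, b"amount=100")
print(verify(key, b"amount=100", tag))  # True
print(verify(key, b"amount=999", tag))  # False: the message was altered
```

`hmac.compare_digest` is used instead of `==` so that the comparison time does not reveal how many leading characters of the tag matched.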
7. Cache Keys & ETags
Generate cache keys from content hashes. If content changes, the hash changes, invalidating the cache.
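A content-derived ETag might look like the following Python sketch. The helper names are hypothetical, and the check is simplified: it handles a single strong ETag and ignores lists and weak validators that a real `If-None-Match` header may carry.

```python
import hashlib
from typing import Optional

def make_etag(content: bytes) -> str:
    """Build a quoted ETag value from the SHA-256 of the content."""
    return '"' + hashlib.sha256(content).hexdigest() + '"'

def is_not_modified(if_none_match: Optional[str], content: bytes) -> bool:
    """True when the client's cached ETag matches the current content,
    in which case a server could answer 304 Not Modified."""
    return if_none_match is not None and if_none_match == make_etag(content)

page_v1 = b"<h1>Hello</h1>"
page_v2 = b"<h1>Hello again</h1>"
etag = make_etag(page_v1)
print(is_not_modified(etag, page_v1))  # True: cached copy is still valid
print(is_not_modified(etag, page_v2))  # False: content changed, cache is stale
```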
8. Git Version Control
Git uses SHA-1 (with a migration to SHA-256 under way) to identify commits, files, and trees. Each object is addressed by the hash of its content.
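For file contents (blobs), the object ID is the SHA-1 of a short header followed by the data. The Python sketch below reproduces that calculation for SHA-1 repositories; the function name `git_blob_id` is chosen for this example.

```python
import hashlib

def git_blob_id(content: bytes) -> str:
    """Compute the SHA-1 object ID Git assigns to a blob with this content."""
    # Git hashes "blob <size in bytes>\0" followed by the raw content
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()

print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 (the empty blob)
```

The result should match what `git hash-object <file>` prints for a file with the same bytes.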
5. Password Hashing (bcrypt, Argon2, scrypt)
🔴 Never Use General Hash Functions for Passwords
MD5, SHA-1, SHA-256 are TOO FAST for password hashing. Attackers can try billions of guesses per second. Use dedicated password hashing algorithms that are intentionally slow.
Password Hashing Algorithm Comparison:
| Algorithm | Status | Memory-Hard | Best For |
|---|---|---|---|
| Argon2id | ✅ Recommended | Yes | New applications |
| bcrypt | ✅ Good | Limited | Widely supported |
| scrypt | ✅ Good | Yes | GPU resistance needed |
| PBKDF2 | ⚠️ Acceptable | No | Legacy/compliance |
bcrypt Example:
// Node.js with bcrypt
const bcrypt = require('bcrypt');

// Hash a password (cost factor 12 = 2^12 iterations)
async function hashPassword(password) {
  const saltRounds = 12;
  return await bcrypt.hash(password, saltRounds);
}
// Output: $2b$12$LQv3c1yqBWVHxkd0LHAkCOYz6TtxMQJqhN8/X4.FSdcqC.xTTBm/G

// Verify a password
async function verifyPassword(password, hash) {
  return await bcrypt.compare(password, hash);
}
Argon2 Example (Python):
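from argon2 import PasswordHasher
from argon2.exceptions import VerifyMismatchError

ph = PasswordHasher(
    time_cost=3,        # Number of iterations
    memory_cost=65536,  # 64 MiB of memory (value is in KiB)
    parallelism=4       # 4 parallel threads
)

# Hash a password (avoid naming the variable "hash": that shadows the built-in)
password_hash = ph.hash("my_password")
# $argon2id$v=19$m=65536,t=3,p=4$...

# Verify a password
try:
    ph.verify(password_hash, "my_password")
    print("Password is correct!")
except VerifyMismatchError:  # catch only the mismatch, not every exception
    print("Invalid password")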
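The comparison table also lists scrypt, which Python exposes in the standard library as `hashlib.scrypt` when built against OpenSSL 1.1 or later. The sketch below is a minimal illustration: the cost parameters are commonly seen example values rather than a recommendation from this guide, and returning a `(salt, key)` pair is a simplified storage format assumed for the example.

```python
import hashlib
import hmac
import secrets

# Illustrative cost parameters (assumptions; tune them for your hardware)
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1, "maxmem": 64 * 1024 * 1024, "dklen": 32}

def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, derived_key) for a new password."""
    salt = secrets.token_bytes(16)  # unique random salt per password
    key = hashlib.scrypt(password.encode(), salt=salt, **SCRYPT_PARAMS)
    return salt, key

def verify_password(password: str, salt: bytes, expected_key: bytes) -> bool:
    """Re-derive the key with the stored salt and compare in constant time."""
    key = hashlib.scrypt(password.encode(), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(key, expected_key)
```

Both the salt and the derived key need to be stored so the password can be checked later, for example `salt, key = hash_password("my_password")` followed by `verify_password("my_password", salt, key)`.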
6. File Integrity Verification
Hash functions let you verify that files haven't been modified or corrupted during download or transfer.
How It Works:
- Developer calculates hash of original file
- Hash is published alongside the download
- User downloads file and calculates its hash
- User compares calculated hash with published hash
- Match = file is unmodified (and authentic, provided the published hash itself comes from a trusted source)
Command Line Examples:
# Windows (PowerShell)
Get-FileHash -Algorithm SHA256 file.zip
# Linux
sha256sum file.zip
# macOS
shasum -a 256 file.zip
# Example output (this particular digest is the SHA-256 of an empty file):
# e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855  file.zip
Code Examples:
// Node.js - Hash a file
const crypto = require('crypto');
const fs = require('fs');

function hashFile(filePath, algorithm = 'sha256') {
  return new Promise((resolve, reject) => {
    const hash = crypto.createHash(algorithm);
    const stream = fs.createReadStream(filePath);
    stream.on('data', data => hash.update(data));
    stream.on('end', () => resolve(hash.digest('hex')));
    stream.on('error', reject);
  });
}
# Python
import hashlib

def hash_file(filepath, algorithm='sha256'):
    hash_obj = hashlib.new(algorithm)
    with open(filepath, 'rb') as f:
        # Read in 8 KB chunks so large files are not loaded into memory at once
        for chunk in iter(lambda: f.read(8192), b''):
            hash_obj.update(chunk)
    return hash_obj.hexdigest()
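The final comparison step can be sketched in Python as well. The function name `verify_file` is hypothetical, and the case and whitespace normalization is an assumption about how published hashes are commonly formatted.

```python
import hashlib
import hmac

def verify_file(filepath: str, expected_hex: str, algorithm: str = "sha256") -> bool:
    """Return True if the file's digest equals the published hex digest."""
    h = hashlib.new(algorithm)
    with open(filepath, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    # Normalize case and whitespace: published hashes are sometimes uppercase
    return hmac.compare_digest(h.hexdigest(), expected_hex.strip().lower())
```

A call such as `verify_file("file.zip", published_hash)` returns `False` if even a single byte of the download differs from the original.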
7. Salting and Peppering
Salt 🧂
A random value unique to each user, stored alongside the password hash in the database.
- Prevents rainbow table attacks
- Same password → different hashes
- Stored with the hash
Pepper 🌶️
A secret value applied to all passwords, stored separately (e.g., environment variable, HSM).
- Adds extra security layer
- Protects if DB is breached
- Never stored in database
Why Salting Matters:
// Without salt - identical passwords have identical hashes (MD5 shown for brevity)
md5("password") → 5f4dcc3b5aa765d61d8327deb882cf99
md5("password") → 5f4dcc3b5aa765d61d8327deb882cf99 // Same!
// With salt - identical passwords have different hashes (placeholder values)
hash("password" + "randomsalt1") → a1b2c3d4e5f6...
hash("password" + "randomsalt2") → x7y8z9w0v1u2... // Different!
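The salted case can be reproduced with real values using Python's standard library. This sketch uses PBKDF2 only because it ships with `hashlib`; the function name `salted_hash` and the iteration count are assumptions for illustration, not tuned recommendations.

```python
import hashlib
import secrets

def salted_hash(password: str, salt: bytes, iterations: int = 100_000) -> str:
    """PBKDF2-HMAC-SHA256 of the password with the given salt, as hex."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations).hex()

salt1 = secrets.token_bytes(16)  # one random salt per user
salt2 = secrets.token_bytes(16)
print(salted_hash("password", salt1))
print(salted_hash("password", salt2))  # differs from the line above
```

Because each user's salt is stored next to the hash, the server can recompute the same value at login, while two users with the same password still end up with unrelated hashes.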
Rainbow Table Attack Prevention:
A rainbow table is a precomputed database of hash → password mappings. Without salt, attackers can look up hashes instantly. With unique salts, attackers would need a separate rainbow table for every possible salt – computationally infeasible.
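The effect of salting on precomputed attacks can be demonstrated with a toy Python example. Two simplifications are assumed here: the table is a plain dictionary (a real rainbow table uses hash chains to save space), and plain SHA-256 stands in for the password hash purely to keep the demonstration short.

```python
import hashlib
import secrets

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()

# The attacker's precomputed table: unsalted hash -> password
common_passwords = ["123456", "password", "qwerty", "letmein"]
lookup = {sha256_hex(p.encode()): p for p in common_passwords}

# Unsalted hash taken from a breached database: recovered with one lookup
stolen_unsalted = sha256_hex(b"letmein")
print(lookup.get(stolen_unsalted))  # letmein

# Salted hash: the precomputed table no longer contains it
salt = secrets.token_bytes(16)
stolen_salted = sha256_hex(salt + b"letmein")
print(lookup.get(stolen_salted))  # None
```

An attacker who knows the salt can still guess passwords one by one, which is why a slow, dedicated password hashing algorithm is needed in addition to salting.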
8. Implementation Examples
JavaScript (Browser & Node.js)
// Browser - using Web Crypto API
async function sha256(message) {
  const encoder = new TextEncoder();
  const data = encoder.encode(message);
  const hash = await crypto.subtle.digest('SHA-256', data);
  return Array.from(new Uint8Array(hash))
    .map(b => b.toString(16).padStart(2, '0'))
    .join('');
}

// Node.js
const crypto = require('crypto');

function hash(algorithm, data) {
  return crypto.createHash(algorithm).update(data).digest('hex');
}
console.log(hash('md5', 'Hello')); // 8b1a9953c4611296a827abf8c47804d7
console.log(hash('sha256', 'Hello')); // 185f8db32271fe25f561a6fc938b2e264...

// HMAC (for message authentication)
function hmac(algorithm, key, data) {
  return crypto.createHmac(algorithm, key).update(data).digest('hex');
}
Python
import hashlib
import hmac

# Basic hashing
def hash_string(algorithm: str, data: str) -> str:
    return hashlib.new(algorithm, data.encode()).hexdigest()

print(hash_string('md5', 'Hello'))  # 8b1a9953c4611296a827abf8c47804d7
print(hash_string('sha256', 'Hello'))  # 185f8db32271fe25f561a6fc938b2e264...

# HMAC
def create_hmac(key: str, message: str, algorithm='sha256') -> str:
    return hmac.new(
        key.encode(),
        message.encode(),
        algorithm
    ).hexdigest()

# Hash file
def hash_file(filepath: str, algorithm='sha256') -> str:
    h = hashlib.new(algorithm)
    with open(filepath, 'rb') as f:
        for chunk in iter(lambda: f.read(8192), b''):
            h.update(chunk)
    return h.hexdigest()
Java
import java.security.MessageDigest;
import java.nio.charset.StandardCharsets;

public class HashExample {
    public static String hash(String algorithm, String data) throws Exception {
        MessageDigest digest = MessageDigest.getInstance(algorithm);
        byte[] hashBytes = digest.digest(data.getBytes(StandardCharsets.UTF_8));
        // Convert each byte to two lowercase hex characters
        StringBuilder hex = new StringBuilder();
        for (byte b : hashBytes) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(hash("MD5", "Hello"));
        System.out.println(hash("SHA-256", "Hello"));
    }
}
PHP
<?php
// Basic hashing
echo hash('md5', 'Hello'); // 8b1a9953c4611296a827abf8c47804d7
echo hash('sha256', 'Hello'); // 185f8db32271fe25f561a6fc938b2e264...
// Password hashing (use this for passwords!)
$hash = password_hash('my_password', PASSWORD_BCRYPT, ['cost' => 12]);
// Or better:
$hash = password_hash('my_password', PASSWORD_ARGON2ID);
// Verify password
if (password_verify('my_password', $hash)) {
    echo "Password is correct!";
}
// Hash file
echo hash_file('sha256', 'path/to/file.zip');
?>
9. Security Considerations
✅ Do: Use SHA-256 or better for security
SHA-256, SHA-384, SHA-512, SHA-3, and BLAKE2 are all secure choices for general hashing.
✅ Do: Use bcrypt/Argon2 for passwords
These are specifically designed for password hashing with built-in salting and configurable work factors.
✅ Do: Use HMAC for message authentication
HMAC combines a secret key with hashing to verify both integrity and authenticity.
❌ Don't: Use MD5 or SHA-1 for security
Both have practical collision attacks. Only use for non-security checksums where collision exploitation isn't possible.
❌ Don't: Hash passwords with SHA-256 directly
SHA-256 is too fast. Attackers can try billions of guesses per second. Use bcrypt/Argon2.
❌ Don't: Create your own hashing scheme
Ad-hoc constructions such as hash(hash(password + salt) + pepper) are error-prone, and inventing your own scheme is dangerous. Use established libraries.
10. Frequently Asked Questions
What is the difference between hashing and encryption?
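Hashing is one-way: the output has a fixed size and cannot be reversed to recover the input. It is used when you only need to verify or identify data. Encryption is two-way: encrypted data can be decrypted back to the original using a key, and the output size varies with the input. It is used when you need to recover the original data.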
Is MD5 completely useless now?
Can two different files have the same hash?
Why can't I reverse a hash to get the original data?
What is a rainbow table attack?
How long should a salt be?
What is the bcrypt cost factor?
Should I use SHA-256 or SHA-512?
What is HMAC and when should I use it?
How do I verify a downloaded file's hash?
1. Find the hash published alongside the download.
2. Calculate the hash of your download:
• Windows: Get-FileHash file.zip -Algorithm SHA256
• Linux: sha256sum file.zip
• macOS: shasum -a 256 file.zip
3. Compare your calculated hash with the published hash.
4. If they match exactly, the file is authentic and unmodified.