DICOM Reference

This appendix provides a comprehensive reference for working with DICOM (Digital Imaging and Communications in Medicine) files in Python. While Chapter 8 introduces the essential concepts, this appendix covers the technical details needed for real-world medical imaging pipelines.

DICOM File Structure

A DICOM file consists of two main components:

  1. Header (Metadata): A structured collection of data elements, each identified by a tag
  2. Pixel Data: The actual image intensities, stored as the last data element

Data Elements and Tags

Each data element has:

  • Tag: A pair of hexadecimal numbers (group, element) identifying the attribute
  • VR (Value Representation): The data type (e.g., LO for Long String, US for Unsigned Short)
  • Length: Number of bytes in the value
  • Value: The actual data
import pydicom

dcm = pydicom.dcmread("image.dcm")

# Access elements by tag
patient_name = dcm[0x0010, 0x0010]  # Using hex tag
print(f"Tag: {patient_name.tag}")
print(f"VR: {patient_name.VR}")
print(f"Value: {patient_name.value}")

# Or by keyword (more readable)
print(dcm.PatientName)
print(dcm.Modality)

Common DICOM Tags

Understanding key tags is essential for correct image processing:

Patient Information

Tag Keyword Description
(0010,0010) PatientName Patient’s full name
(0010,0020) PatientID Hospital patient identifier
(0010,0030) PatientBirthDate Date of birth (YYYYMMDD)
(0010,0040) PatientSex M, F, or O

Study & Series

Tag Keyword Description
(0008,0020) StudyDate Date study was performed
(0008,0060) Modality CT, MR, CR, DX, US, etc.
(0008,103E) SeriesDescription Free-text series description
(0020,000D) StudyInstanceUID Unique study identifier
(0020,000E) SeriesInstanceUID Unique series identifier

Image Geometry

Tag Keyword Description
(0028,0010) Rows Image height in pixels
(0028,0011) Columns Image width in pixels
(0028,0030) PixelSpacing Physical size of pixels [row, column] in mm
(0018,0050) SliceThickness Thickness of slice in mm
(0020,0032) ImagePositionPatient [x, y, z] position of top-left pixel
(0020,0037) ImageOrientationPatient Direction cosines for row and column
(0018,0088) SpacingBetweenSlices Distance between adjacent slices

Pixel Data

Tag Keyword Description
(0028,0100) BitsAllocated Bits per pixel (8, 12, 16)
(0028,0101) BitsStored Bits actually used
(0028,0102) HighBit Most significant bit position
(0028,0103) PixelRepresentation 0 = unsigned, 1 = signed
(0028,1052) RescaleIntercept Value b in: output = m*pixel + b
(0028,1053) RescaleSlope Value m in: output = m*pixel + b
(7FE0,0010) PixelData The actual image data

Reading DICOM Files

Single Files

import pydicom
import numpy as np

# Basic read
dcm = pydicom.dcmread("image.dcm")

# Read without loading pixel data (faster for metadata-only operations)
dcm = pydicom.dcmread("image.dcm", stop_before_pixels=True)

# Force reading despite missing elements
dcm = pydicom.dcmread("image.dcm", force=True)

Accessing Pixel Data

# Get pixel array as numpy
pixels = dcm.pixel_array

# For CT: apply rescale slope and intercept to get Hounsfield units
def get_hu_image(dcm):
    """Convert stored pixel values to Hounsfield units."""
    pixels = dcm.pixel_array.astype(np.float32)

    # Apply rescale
    slope = getattr(dcm, 'RescaleSlope', 1)
    intercept = getattr(dcm, 'RescaleIntercept', 0)

    hu = pixels * slope + intercept
    return hu

hu_image = get_hu_image(dcm)
print(f"HU range: {hu_image.min():.0f} to {hu_image.max():.0f}")

Handling Photometric Interpretation

X-rays can be stored as MONOCHROME1 (high values = dark) or MONOCHROME2 (high values = bright):

def normalize_photometric(dcm):
    """Ensure consistent display: high values = bright."""
    pixels = dcm.pixel_array.astype(np.float32)

    if dcm.PhotometricInterpretation == "MONOCHROME1":
        # Invert so high values are bright
        pixels = pixels.max() - pixels

    return pixels

CT Series and 3D Volumes

A CT scan consists of multiple DICOM files (one per slice). Loading them correctly requires:

  1. Finding all files in the series
  2. Sorting by slice position
  3. Stacking into a 3D array
  4. Computing correct voxel spacing

Loading a CT Volume

import os
from pathlib import Path
import pydicom
import numpy as np

def load_ct_volume(directory):
    """Load CT series from directory into 3D numpy array."""

    # Find all DICOM files
    dicom_files = []
    for f in Path(directory).iterdir():
        if f.is_file():
            try:
                dcm = pydicom.dcmread(f, stop_before_pixels=True)
                dicom_files.append((f, dcm))
            except:
                continue  # Skip non-DICOM files

    if not dicom_files:
        raise ValueError(f"No DICOM files found in {directory}")

    # Sort by ImagePositionPatient (z-coordinate)
    dicom_files.sort(key=lambda x: float(x[1].ImagePositionPatient[2]))

    # Load all slices
    slices = []
    for filepath, _ in dicom_files:
        dcm = pydicom.dcmread(filepath)
        # Convert to Hounsfield units
        pixels = dcm.pixel_array.astype(np.float32)
        slope = getattr(dcm, 'RescaleSlope', 1)
        intercept = getattr(dcm, 'RescaleIntercept', 0)
        hu = pixels * slope + intercept
        slices.append(hu)

    # Stack into 3D volume
    volume = np.stack(slices, axis=0)  # Shape: [depth, height, width]

    # Get voxel spacing
    ref_dcm = pydicom.dcmread(dicom_files[0][0])
    pixel_spacing = ref_dcm.PixelSpacing  # [row_spacing, col_spacing] in mm

    # Calculate slice spacing from first two slices
    if len(dicom_files) > 1:
        pos1 = pydicom.dcmread(dicom_files[0][0]).ImagePositionPatient
        pos2 = pydicom.dcmread(dicom_files[1][0]).ImagePositionPatient
        slice_spacing = abs(pos2[2] - pos1[2])
    else:
        slice_spacing = getattr(ref_dcm, 'SliceThickness', 1.0)

    spacing = np.array([slice_spacing, float(pixel_spacing[0]), float(pixel_spacing[1])])

    return volume, spacing

# Usage
volume, spacing = load_ct_volume("path/to/ct_series/")
print(f"Volume shape: {volume.shape}")
print(f"Voxel spacing (mm): {spacing}")

Coordinate Systems

Medical imaging uses specific coordinate conventions that differ from image array indices. Understanding these is critical for correct spatial operations.

Patient Coordinate System (LPS/RAS)

DICOM uses the LPS (Left-Posterior-Superior) coordinate system:

  • X-axis: Points to patient’s Left
  • Y-axis: Points to patient’s Posterior (back)
  • Z-axis: Points to patient’s Superior (head)

Neuroimaging often uses RAS (Right-Anterior-Superior)—the axes are flipped.

Image Position and Orientation

Two tags define how pixel indices map to patient coordinates:

  • ImagePositionPatient (0020,0032): The (x, y, z) coordinates of the top-left pixel
  • ImageOrientationPatient (0020,0037): Six values defining the direction cosines
def pixel_to_patient(dcm, row, col):
    """Convert pixel indices to patient coordinates (mm)."""

    # Image position (top-left corner of first pixel)
    S = np.array(dcm.ImagePositionPatient)

    # Direction cosines
    orientation = dcm.ImageOrientationPatient
    row_cosine = np.array(orientation[:3])  # Direction of rows
    col_cosine = np.array(orientation[3:])  # Direction of columns

    # Pixel spacing
    spacing = dcm.PixelSpacing  # [row_spacing, col_spacing]

    # Calculate patient coordinate
    patient_coord = (
        S +
        col * spacing[1] * row_cosine +  # Column moves along row direction
        row * spacing[0] * col_cosine     # Row moves along column direction
    )

    return patient_coord

# Example: find patient coordinates of pixel (100, 150)
coord = pixel_to_patient(dcm, row=100, col=150)
print(f"Patient position: {coord} mm")

Affine Transformation Matrix

The complete mapping from voxel indices to patient coordinates can be expressed as a 4×4 affine matrix:

def get_affine_matrix(dcm):
    """Build 4x4 affine matrix from DICOM metadata."""

    # For a 3D volume, we need slice information too
    orientation = dcm.ImageOrientationPatient
    position = dcm.ImagePositionPatient
    spacing = dcm.PixelSpacing

    row_cosine = np.array(orientation[:3])
    col_cosine = np.array(orientation[3:])

    # Slice direction (cross product of row and column cosines)
    slice_cosine = np.cross(row_cosine, col_cosine)

    # For single slice, use SliceThickness; for volume, compute from positions
    slice_spacing = getattr(dcm, 'SliceThickness', 1.0)

    # Build affine matrix
    affine = np.eye(4)
    affine[:3, 0] = row_cosine * spacing[1]   # Column direction
    affine[:3, 1] = col_cosine * spacing[0]   # Row direction
    affine[:3, 2] = slice_cosine * slice_spacing  # Slice direction
    affine[:3, 3] = position  # Origin

    return affine

Format Conversion

DICOM to NIfTI

NIfTI is the standard format for neuroimaging research. It stores the volume and affine matrix in a single file:

import nibabel as nib
import numpy as np

def dicom_to_nifti(dicom_dir, output_path):
    """Convert DICOM series to NIfTI format."""

    # Load using SimpleITK
    import SimpleITK as sitk

    reader = sitk.ImageSeriesReader()
    series_ids = reader.GetGDCMSeriesIDs(dicom_dir)
    file_names = reader.GetGDCMSeriesFileNames(dicom_dir, series_ids[0])
    reader.SetFileNames(file_names)
    image = reader.Execute()

    # Get data and metadata
    volume = sitk.GetArrayFromImage(image)  # [z, y, x]
    spacing = image.GetSpacing()            # [x, y, z]
    origin = image.GetOrigin()
    direction = np.array(image.GetDirection()).reshape(3, 3)

    # Build affine matrix
    affine = np.eye(4)
    affine[:3, :3] = direction @ np.diag(spacing)
    affine[:3, 3] = origin

    # Transpose volume to NIfTI convention [x, y, z]
    volume = np.transpose(volume, (2, 1, 0))

    # Create and save NIfTI
    nifti_img = nib.Nifti1Image(volume, affine)
    nib.save(nifti_img, output_path)

dicom_to_nifti("path/to/ct_series/", "output.nii.gz")

DICOM to PNG/JPEG

For visualization or training 2D models:

from PIL import Image
import numpy as np

def dicom_to_png(dcm_path, output_path, window_center=None, window_width=None):
    """Convert single DICOM to PNG with optional windowing."""

    dcm = pydicom.dcmread(dcm_path)
    pixels = dcm.pixel_array.astype(np.float32)

    # Apply rescale for CT
    slope = getattr(dcm, 'RescaleSlope', 1)
    intercept = getattr(dcm, 'RescaleIntercept', 0)
    pixels = pixels * slope + intercept

    # Apply window/level if specified
    if window_center is not None and window_width is not None:
        lower = window_center - window_width / 2
        upper = window_center + window_width / 2
        pixels = np.clip(pixels, lower, upper)
        pixels = (pixels - lower) / (upper - lower) * 255
    else:
        # Simple min-max normalization
        pixels = (pixels - pixels.min()) / (pixels.max() - pixels.min()) * 255

    # Convert to 8-bit and save
    pixels = pixels.astype(np.uint8)
    Image.fromarray(pixels).save(output_path)

# Chest X-ray (no windowing needed)
dicom_to_png("chest_xray.dcm", "chest_xray.png")

# CT with lung window
dicom_to_png("ct_slice.dcm", "ct_lung.png", window_center=-600, window_width=1500)

Multi-Frame and Enhanced DICOM

Modern scanners may store entire volumes in a single “enhanced” DICOM file with multiple frames.

Detecting Multi-Frame Files

dcm = pydicom.dcmread("enhanced_ct.dcm")

# Check for multi-frame
num_frames = getattr(dcm, 'NumberOfFrames', 1)
print(f"Number of frames: {num_frames}")

if num_frames > 1:
    # Pixel array will be 3D: [frames, rows, cols]
    volume = dcm.pixel_array
    print(f"Volume shape: {volume.shape}")

Enhanced CT/MR

Enhanced DICOM organizes per-frame metadata differently:

def read_enhanced_ct(dcm_path):
    """Read enhanced (multi-frame) CT DICOM."""
    dcm = pydicom.dcmread(dcm_path)

    # Get pixel data
    volume = dcm.pixel_array.astype(np.float32)

    # Per-frame functional groups contain frame-specific metadata
    if hasattr(dcm, 'PerFrameFunctionalGroupsSequence'):
        # Get slice positions
        positions = []
        for frame_group in dcm.PerFrameFunctionalGroupsSequence:
            plane_pos = frame_group.PlanePositionSequence[0]
            positions.append(plane_pos.ImagePositionPatient[2])

        # Sort frames by position
        sort_order = np.argsort(positions)
        volume = volume[sort_order]

    # Rescale values
    if hasattr(dcm, 'SharedFunctionalGroupsSequence'):
        shared = dcm.SharedFunctionalGroupsSequence[0]
        if hasattr(shared, 'PixelValueTransformationSequence'):
            transform = shared.PixelValueTransformationSequence[0]
            slope = transform.RescaleSlope
            intercept = transform.RescaleIntercept
            volume = volume * slope + intercept

    return volume

RT Structure Sets (Contours)

Radiation therapy planning uses DICOM RT Structure Sets to store organ and tumor contours.

Reading RT Structures

def read_rtstruct(rtstruct_path):
    """Read contours from DICOM RT Structure Set."""
    rs = pydicom.dcmread(rtstruct_path)

    structures = {}

    # Iterate through structure set ROI sequence
    for roi in rs.StructureSetROISequence:
        roi_number = roi.ROINumber
        roi_name = roi.ROIName

        structures[roi_number] = {
            'name': roi_name,
            'contours': []
        }

    # Get contour data
    for roi_contour in rs.ROIContourSequence:
        roi_number = roi_contour.ReferencedROINumber

        if hasattr(roi_contour, 'ContourSequence'):
            for contour in roi_contour.ContourSequence:
                # Contour data is a flat list: x1,y1,z1,x2,y2,z2,...
                data = contour.ContourData
                points = np.array(data).reshape(-1, 3)
                structures[roi_number]['contours'].append(points)

    return structures

structures = read_rtstruct("rtstruct.dcm")
for num, struct in structures.items():
    print(f"ROI {num}: {struct['name']} - {len(struct['contours'])} contours")

Converting Contours to Masks

from skimage.draw import polygon
import numpy as np

def contour_to_mask(contour_points, ct_image, slice_position):
    """Convert contour points to binary mask for a single slice."""

    # Get image geometry
    spacing = ct_image.GetSpacing()
    origin = ct_image.GetOrigin()
    direction = np.array(ct_image.GetDirection()).reshape(3, 3)
    size = ct_image.GetSize()

    # Filter points for this slice (within tolerance)
    z_tolerance = spacing[2] / 2
    slice_points = contour_points[
        np.abs(contour_points[:, 2] - slice_position) < z_tolerance
    ]

    if len(slice_points) == 0:
        return np.zeros((size[1], size[0]), dtype=np.uint8)

    # Convert physical coordinates to pixel indices
    physical_to_pixel = np.linalg.inv(direction @ np.diag(spacing))
    pixel_coords = (slice_points[:, :2] - origin[:2]) @ physical_to_pixel[:2, :2].T

    # Create mask
    mask = np.zeros((size[1], size[0]), dtype=np.uint8)
    rr, cc = polygon(pixel_coords[:, 1], pixel_coords[:, 0], mask.shape)
    mask[rr, cc] = 1

    return mask

Using rt-utils Library

For easier RT structure handling:

from rt_utils import RTStructBuilder

# Load existing RT struct
rtstruct = RTStructBuilder.create_from(
    dicom_series_path="path/to/ct_series/",
    rt_struct_path="rtstruct.dcm"
)

# Get list of ROI names
roi_names = rtstruct.get_roi_names()
print(f"Available structures: {roi_names}")

# Get mask for specific structure
mask = rtstruct.get_roi_mask_by_name("GTV")
print(f"Mask shape: {mask.shape}")

De-Identification

Before sharing medical images, protected health information (PHI) must be removed.

Basic De-Identification

def deidentify_dicom(dcm_path, output_path, new_patient_id="ANON"):
    """Remove PHI from DICOM file."""
    dcm = pydicom.dcmread(dcm_path)

    # Tags containing PHI
    phi_tags = [
        'PatientName',
        'PatientID',
        'PatientBirthDate',
        'PatientAddress',
        'PatientTelephoneNumbers',
        'InstitutionName',
        'InstitutionAddress',
        'ReferringPhysicianName',
        'PhysiciansOfRecord',
        'OperatorsName',
        'StudyDate',
        'SeriesDate',
        'AcquisitionDate',
        'ContentDate',
        'StudyTime',
        'SeriesTime',
        'AcquisitionTime',
        'ContentTime',
    ]

    for tag in phi_tags:
        if hasattr(dcm, tag):
            if 'Date' in tag:
                setattr(dcm, tag, '19000101')
            elif 'Time' in tag:
                setattr(dcm, tag, '000000')
            else:
                setattr(dcm, tag, new_patient_id if 'Patient' in tag else 'REMOVED')

    # Generate new UIDs
    import uuid
    dcm.StudyInstanceUID = pydicom.uid.generate_uid()
    dcm.SeriesInstanceUID = pydicom.uid.generate_uid()
    dcm.SOPInstanceUID = pydicom.uid.generate_uid()

    dcm.save_as(output_path)

deidentify_dicom("original.dcm", "anonymous.dcm", "STUDY001")

Removing Burned-In PHI

Some images have PHI burned into pixel data (visible text). This requires image processing:

def remove_burned_in_phi(dcm, regions_to_black):
    """Black out regions with burned-in text.

    regions_to_black: list of (top, left, height, width) tuples
    """
    pixels = dcm.pixel_array.copy()

    for top, left, height, width in regions_to_black:
        pixels[top:top+height, left:left+width] = 0

    dcm.PixelData = pixels.tobytes()
    return dcm

Using Established Tools

For production de-identification, use validated tools:

  • DICOM Anonymizer (deid): https://pydicom.github.io/deid/
  • DicomCleaner: Part of PixelMed toolkit
  • CTP (Clinical Trial Processor): RSNA’s standard anonymization tool
# Using deid library
from deid.config import DeidRecipe
from deid.dicom import replace_identifiers

recipe = DeidRecipe()
recipe.load("deid-recipe.txt")  # Custom anonymization rules

replace_identifiers(
    dicom_files=["image.dcm"],
    output_folder="anonymized/",
    recipe=recipe
)

Transfer Syntax and Compression

DICOM files can use various compression schemes, affecting how pixel data is stored.

Common Transfer Syntaxes

Transfer Syntax UID Name Compression
1.2.840.10008.1.2 Implicit VR Little Endian None
1.2.840.10008.1.2.1 Explicit VR Little Endian None
1.2.840.10008.1.2.4.50 JPEG Baseline Lossy
1.2.840.10008.1.2.4.70 JPEG Lossless Lossless
1.2.840.10008.1.2.4.90 JPEG 2000 Lossless Lossless
1.2.840.10008.1.2.5 RLE Lossless Lossless

Handling Compressed Files

import pydicom
from pydicom.pixel_data_handlers import apply_rescale

# pydicom handles most decompression automatically if you have:
# - pylibjpeg (for JPEG)
# - pillow (for some formats)
# - gdcm (comprehensive but complex to install)

dcm = pydicom.dcmread("compressed.dcm")

# Check transfer syntax
print(f"Transfer Syntax: {dcm.file_meta.TransferSyntaxUID}")

# Access pixel data (decompresses automatically)
pixels = dcm.pixel_array

Converting Transfer Syntax

def decompress_dicom(input_path, output_path):
    """Convert to uncompressed DICOM."""
    dcm = pydicom.dcmread(input_path)

    # Decompress pixel data
    dcm.decompress()

    # Change transfer syntax to explicit VR little endian
    dcm.file_meta.TransferSyntaxUID = pydicom.uid.ExplicitVRLittleEndian

    dcm.save_as(output_path)

Troubleshooting Common Issues

Missing or Incorrect Metadata

def safe_get_attribute(dcm, attr, default=None):
    """Safely get DICOM attribute with fallback."""
    try:
        value = getattr(dcm, attr, default)
        if value is None or (isinstance(value, str) and value.strip() == ''):
            return default
        return value
    except:
        return default

# Example: handle missing pixel spacing
pixel_spacing = safe_get_attribute(dcm, 'PixelSpacing', [1.0, 1.0])

Inconsistent Series

def validate_series(dicom_files):
    """Check that DICOM files form a consistent series."""
    issues = []

    ref_dcm = pydicom.dcmread(dicom_files[0], stop_before_pixels=True)
    ref_series_uid = ref_dcm.SeriesInstanceUID
    ref_rows = ref_dcm.Rows
    ref_cols = ref_dcm.Columns

    for f in dicom_files[1:]:
        dcm = pydicom.dcmread(f, stop_before_pixels=True)

        if dcm.SeriesInstanceUID != ref_series_uid:
            issues.append(f"{f}: Different SeriesInstanceUID")
        if dcm.Rows != ref_rows or dcm.Columns != ref_cols:
            issues.append(f"{f}: Inconsistent dimensions")

    return issues

Slice Ordering Problems

def get_sorted_slices(directory):
    """Sort DICOM slices handling common edge cases."""
    slices = []

    for f in Path(directory).glob("*"):
        try:
            dcm = pydicom.dcmread(f, stop_before_pixels=True)

            # Try ImagePositionPatient first
            if hasattr(dcm, 'ImagePositionPatient'):
                z_pos = float(dcm.ImagePositionPatient[2])
            # Fall back to SliceLocation
            elif hasattr(dcm, 'SliceLocation'):
                z_pos = float(dcm.SliceLocation)
            # Last resort: InstanceNumber
            elif hasattr(dcm, 'InstanceNumber'):
                z_pos = float(dcm.InstanceNumber)
            else:
                z_pos = 0

            slices.append((f, z_pos))
        except:
            continue

    # Sort by z position
    slices.sort(key=lambda x: x[1])
    return [s[0] for s in slices]

Further Resources

  • DICOM Standard: https://www.dicomstandard.org/ — The official specification
  • pydicom Documentation: https://pydicom.github.io/ — Python DICOM library
  • SimpleITK: https://simpleitk.org/ — Robust medical image I/O
  • MONAI: https://monai.io/ — PyTorch framework for medical imaging
  • Innolitics DICOM Browser: https://dicom.innolitics.com/ — Interactive tag reference