Pydicom Reference Guide

Common pydicom functions called by user code

File Reading/Parsing

The main function for reading and parsing DICOM files with pydicom is read_file. It is defined in the module pydicom.filereader, but is also available at the package level when pydicom is imported:

>>> import pydicom
>>> dataset = pydicom.read_file(...)

If you need fine control over the reading, you can either call read_partial or use open_dicom. All are documented below:

pydicom.filereader.read_file(fp, defer_size=None, stop_before_pixels=False, force=False)

Read and parse a DICOM dataset stored in the DICOM File Format.

Read a DICOM dataset stored in accordance with the DICOM File Format (DICOM Standard Part 10 Section 7). If the dataset is not stored in accordance with the File Format (i.e. the preamble and prefix are missing, there are missing required Type 1 File Meta Information Group elements or the entire File Meta Information is missing) then you will have to set force to True.

Parameters:

fp : str or file-like

Either a file-like object, or a string containing the file name. If a file-like object, the caller is responsible for closing it.

defer_size : int or str or None

If None (default), all elements are read into memory. If specified, then if a data element’s stored value is larger than defer_size, the value is not read into memory until it is accessed in code. Specify an integer (bytes), or a string value with units, e.g. "512 KB", "2 MB".

stop_before_pixels : bool

If False (default), the full file will be read and parsed. Set True to stop before reading (7FE0,0010) ‘Pixel Data’ (and all subsequent elements).

force : bool

If False (default), raises an InvalidDicomError if the file is missing the File Meta Information header. Set to True to force reading even if no File Meta Information header is found.
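To make the string form of defer_size concrete, the sketch below converts values such as "512 KB" into a byte count. The helper name parse_size and the 1024-based unit table are assumptions made for illustration; this is not pydicom's own implementation.

```python
# Hypothetical helper: not part of pydicom. Units are assumed to be 1024-based.
UNITS = {"KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}


def parse_size(size):
    """Convert None, an int, or a string such as '512 KB' into a byte count."""
    if size is None or isinstance(size, int):
        return size
    text = size.strip().upper()
    for unit, factor in UNITS.items():
        if text.endswith(unit):
            # Everything before the unit suffix is the numeric part.
            return int(float(text[: -len(unit)]) * factor)
    # No recognised unit: treat the string as a plain number of bytes.
    return int(text)


print(parse_size("512 KB"))  # 524288
print(parse_size("2 MB"))    # 2097152
print(parse_size(4096))      # 4096
```

An element whose stored value exceeds the resulting byte count would then be a candidate for deferred reading.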

Returns:

FileDataset

An instance of FileDataset that represents a parsed DICOM file.

Raises:

InvalidDicomError

If force is False and the file is not a valid DICOM file.

See also

pydicom.dataset.FileDataset
Data class that is returned.
pydicom.filereader.read_partial
Only read part of a DICOM file, stopping on given conditions.

Examples

Read and return a dataset stored in accordance with the DICOM File Format:

>>> ds = pydicom.read_file("rtplan.dcm")
>>> ds.PatientName

Read and return a dataset not in accordance with the DICOM File Format:

>>> ds = pydicom.read_file("rtplan.dcm", force=True)
>>> ds.PatientName

Use within a context manager:

>>> with pydicom.read_file("rtplan.dcm") as ds:
...     ds.PatientName
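The preamble and prefix mentioned in the description of read_file can be checked by hand: a file in the DICOM File Format starts with a 128-byte preamble followed by the four bytes "DICM". The following is a minimal sketch of such a check using only the standard library. The function name has_dicom_prefix is hypothetical, and the check says nothing about the File Meta Information elements that follow the prefix.

```python
import io


def has_dicom_prefix(fp):
    """Return True if fp holds a 128-byte preamble followed by b'DICM'.

    The file position is restored afterwards, so the caller can still
    pass the same object on to a reader.
    """
    start = fp.tell()
    header = fp.read(132)
    fp.seek(start)
    return len(header) == 132 and header[128:132] == b"DICM"


# In-memory stand-ins for real files (illustrative bytes only).
with_prefix = io.BytesIO(b"\x00" * 128 + b"DICM" + b"\x02\x00\x00\x00")
without_prefix = io.BytesIO(b"\x08\x00\x05\x00CS\x0a\x00")

print(has_dicom_prefix(with_prefix))     # True
print(has_dicom_prefix(without_prefix))  # False
```

A stream that fails this check is the kind of input for which force=True would be needed.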

pydicom.filereader.read_partial(fileobj, stop_when=None, defer_size=None, force=False)

Parse a DICOM file until a condition is met.

Parameters:

fileobj : a file-like object

Note that the file will not close when the function returns.

stop_when :

Stop condition. See read_dataset for more info.

defer_size : int, str, None, optional

See read_file for parameter info.

force : boolean

See read_file for parameter info.

Returns:

FileDataset instance or DicomDir instance.

See also

read_file
More generic file reading function.

Notes

Use read_file unless you need to stop on some condition other than reaching pixel data.

File Writing

DICOM files can also be written using pydicom. There are two ways to do this. The first is to use write_file with a pre-existing FileDataset (derived from Dataset) instance. The second is to use the save_as method on a Dataset instance.

pydicom.filewriter.write_file(filename, dataset, write_like_original=True)

Write dataset to the filename specified.

If write_like_original is True then dataset will be written as is (after minimal validation checking) and may or may not contain all or parts of the File Meta Information (and hence may or may not be conformant with the DICOM File Format). If write_like_original is False, dataset will be stored in the DICOM File Format in accordance with DICOM Standard Part 10 Section 7. The byte stream of the dataset will be placed into the file after the DICOM File Meta Information.

Parameters:

filename : str or file-like

Name of file or the file-like to write the new DICOM file to.

dataset : pydicom.dataset.FileDataset

Dataset holding the DICOM information; e.g. an object read with pydicom.read_file().

write_like_original : bool

If True (default), preserves the following information from the Dataset (and may result in a non-conformant file):

  • preamble – if the original file has no preamble then none will be written.
  • file_meta – if the original file was missing any required File Meta Information Group elements then they will not be added or written. If (0002,0000) ‘File Meta Information Group Length’ is present then it may have its value updated.
  • seq.is_undefined_length – if the original had delimiters, write them now too, instead of the more sensible length characters.
  • is_undefined_length_sequence_item – for datasets that belong to a sequence, write the undefined length delimiters if that is what the original had.

If False, produces a file conformant with the DICOM File Format, with explicit lengths for all elements.

See also

pydicom.dataset.FileDataset
Dataset class with relevant attributes and information.
pydicom.dataset.Dataset.save_as
Write a DICOM file from a dataset that was read in with read_file(). save_as wraps write_file.

Dataset.save_as(filename, write_like_original=True)

Write the Dataset to filename.

Saving a Dataset requires that the Dataset.is_implicit_VR and Dataset.is_little_endian attributes exist and are set appropriately. If Dataset.file_meta.TransferSyntaxUID is present then it should be set to a consistent value to ensure conformance.

Parameters:

filename : str or file-like

Name of file or the file-like to write the new DICOM file to.

write_like_original : bool

If True (default), preserves the following information from the Dataset (and may result in a non-conformant file):

  • preamble – if the original file has no preamble then none will be written.
  • file_meta – if the original file was missing any required File Meta Information Group elements then they will not be added or written. If (0002,0000) ‘File Meta Information Group Length’ is present then it may have its value updated.
  • seq.is_undefined_length – if the original had delimiters, write them now too, instead of the more sensible length characters.
  • is_undefined_length_sequence_item – for datasets that belong to a sequence, write the undefined length delimiters if that is what the original had.

If False, produces a file conformant with the DICOM File Format, with explicit lengths for all elements.

See also

pydicom.filewriter.write_dataset
Write a DICOM Dataset to a file.
pydicom.filewriter.write_file_meta_info
Write the DICOM File Meta Information Group elements to a file.
pydicom.filewriter.write_file
Write a DICOM file from a FileDataset instance.

Dataset

class pydicom.dataset.Dataset(*args, **kwargs)

A collection (dictionary) of DICOM DataElements.

Examples

Add DataElements to the Dataset (for elements in the DICOM dictionary):

>>> from pydicom.dataset import Dataset
>>> from pydicom.dataelem import DataElement
>>> ds = Dataset()
>>> ds.PatientName = "CITIZEN^Joan"
>>> ds.add_new(0x00100020, 'LO', '12345')
>>> ds[0x0010, 0x0030] = DataElement(0x00100030, 'DA', '20010101')

Add a Sequence DataElement to the Dataset:

>>> ds.BeamSequence = [Dataset(), Dataset(), Dataset()]
>>> ds.BeamSequence[0].Manufacturer = "Linac, co."
>>> ds.BeamSequence[1].Manufacturer = "Linac and Sons, co."
>>> ds.BeamSequence[2].Manufacturer = "Linac and Daughters, co."

Add private DataElements to the Dataset (three equivalent forms):

>>> ds.add(DataElement(0x0043102b, 'SS', [4, 4, 0, 0]))
>>> ds.add_new(0x0043102b, 'SS', [4, 4, 0, 0])
>>> ds[0x0043, 0x102b] = DataElement(0x0043102b, 'SS', [4, 4, 0, 0])

Updating and retrieving DataElement values:

>>> ds.PatientName = "CITIZEN^Joan"
>>> ds.PatientName
'CITIZEN^Joan'
>>> ds.PatientName = "CITIZEN^John"
>>> ds.PatientName
'CITIZEN^John'

Retrieving a DataElement’s value from a Sequence:

>>> ds.BeamSequence[0].Manufacturer
'Linac, co.'
>>> ds.BeamSequence[1].Manufacturer
'Linac and Sons, co.'

Retrieving DataElements:

>>> elem = ds[0x00100010]
>>> elem = ds.data_element('PatientName')
>>> elem
(0010, 0010) Patient's Name PN: 'CITIZEN^Joan'

Deleting a DataElement from the Dataset:

>>> del ds.PatientID
>>> del ds.BeamSequence[1].Manufacturer
>>> del ds.BeamSequence[2]

Deleting a private DataElement from the Dataset:

>>> del ds[0x0043, 0x102b]

Determining if a DataElement is present in the Dataset:

>>> 'PatientName' in ds
True
>>> 'PatientID' in ds
False
>>> 0x00100030 in ds
True
>>> 'Manufacturer' in ds.BeamSequence[0]
True

Iterating through the top level of a Dataset only (excluding Sequences):

>>> for elem in ds:
...     print(elem)

Iterating through the entire Dataset (including Sequences):

>>> for elem in ds.iterall():
...     print(elem)

Recursively iterate through a Dataset (including Sequences):

>>> def recurse(ds):
...     for elem in ds:
...         if elem.VR == 'SQ':
...             # The value of a sequence element is a list of Datasets
...             for item in elem.value:
...                 recurse(item)
...         else:
...             # Do something useful with each DataElement
...             pass

Attributes

default_element_format (str) The default formatting for string display.
default_sequence_element_format (str) The default formatting for string display of sequences.
indent_chars (str) For string display, the characters used to indent nested Sequences. Default is "   " (three spaces).

Methods

add
add_new
clear
copy Generic (shallow and deep) copying operations.
data_element
decode
dir
formatted_lines
fromkeys
get
get_item
group_dataset
items
iterall
keys
pop
popitem
remove_private_tags
save_as
setdefault
top
trait_names
update
values
walk

add(data_element)

Add a DataElement to the Dataset.

Equivalent to ds[data_element.tag] = data_element

Parameters:

data_element : pydicom.dataelem.DataElement

The DataElement to add to the Dataset.

add_new(tag, VR, value)

Add a DataElement to the Dataset.

Parameters:

tag

The DICOM (group, element) tag in any form accepted by pydicom.tag.Tag such as [0x0010, 0x0010], (0x10, 0x10), 0x00100010, etc.

VR : str

The 2 character DICOM value representation (see DICOM standard part 5, Section 6.2).

value

The value of the data element. One of the following:
  • a single string or number
  • a list or tuple with all strings or all numbers
  • a multi-value string with backslash separator
  • for a sequence DataElement, an empty list or list of Dataset

clear() → None. Remove all items from D.
copy() → a shallow copy of D

data_element(name)

Return the DataElement corresponding to the element keyword name.

Parameters:

name : str

A DICOM element keyword.

Returns:

pydicom.dataelem.DataElement or None

For the given DICOM element keyword, return the corresponding Dataset DataElement if present, None otherwise.

decode()

Apply character set decoding to all DataElements in the Dataset.

See DICOM PS3.5-2008 6.1.1.

dir(*filters)

Return an alphabetical list of DataElement keywords in the Dataset.

Intended mainly for use in interactive Python sessions. Only lists the DataElement keywords in the current level of the Dataset (i.e. the contents of any Sequence elements are ignored).

Parameters:

filters : str

Zero or more string arguments to the function. Used for case-insensitive match to any part of the DICOM keyword.

Returns:

list of str

The matching DataElement keywords in the dataset. If no filters are used then all DataElement keywords are returned.

formatted_lines(element_format='%(tag)s %(name)-35.35s %(VR)s: %(repval)s', sequence_element_format='%(tag)s %(name)-35.35s %(VR)s: %(repval)s', indent_format=None)

Iterate through the Dataset yielding formatted str for each element.

Parameters:

element_format : str

The string format to use for non-sequence elements. Formatting uses the attributes of DataElement. Default is "%(tag)s %(name)-35.35s %(VR)s: %(repval)s".

sequence_element_format : str

The string format to use for sequence elements. Formatting uses the attributes of DataElement. Default is "%(tag)s %(name)-35.35s %(VR)s: %(repval)s".

indent_format : str or None

Placeholder for future functionality.

Yields:

str

A string representation of a DataElement.
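The default format string is ordinary Python %-style formatting with named fields. The sketch below applies it to a plain dictionary rather than to a real DataElement, to show what the "-35.35s" specifier does: the name is left-justified, padded to 35 characters, and truncated if longer. The helper format_element and the sample field values are hypothetical.

```python
ELEMENT_FORMAT = "%(tag)s %(name)-35.35s %(VR)s: %(repval)s"


def format_element(tag, name, vr, repval):
    """Apply the default element format to plain strings (illustration only)."""
    return ELEMENT_FORMAT % {"tag": tag, "name": name, "VR": vr, "repval": repval}


# A short name is padded with spaces up to 35 characters.
padded = format_element("(0010, 0010)", "Patient's Name", "PN", "'CITIZEN^Joan'")

# A name longer than 35 characters is cut off at 35.
long_name = "An Unusually Long Hypothetical Element Name For Display"
truncated = format_element("(0009, 0010)", long_name, "LO", "'example'")

print(padded)
print(truncated)
```

Because the name column always occupies exactly 35 characters, the VR and value columns line up from one element to the next.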

fromkeys()

Returns a new dict with keys from iterable and values equal to value.

get(key, default=None)

Extend dict.get() to handle DICOM DataElement keywords.

Parameters:

key : str or pydicom.tag.Tag

The element keyword or Tag or the class attribute name to get.

default : obj or None

If the DataElement or class attribute is not present, return default (default None).

Returns:

value

If key is the keyword for a DataElement in the Dataset then return the DataElement’s value.

pydicom.dataelem.DataElement

If key is a tag for a DataElement in the Dataset then return the DataElement instance.

value

If key is a class attribute then return its value.

get_item(key)

Return the raw data element if possible.

It will be raw if the user has never accessed the value, or set their own value. Note if the data element is a deferred-read element, then it is read and converted before being returned.

Parameters:

key

The DICOM (group, element) tag in any form accepted by pydicom.tag.Tag such as [0x0010, 0x0010], (0x10, 0x10), 0x00100010, etc.

Returns:

pydicom.dataelem.DataElement

group_dataset(group)

Return a Dataset containing only DataElements of a certain group.

Parameters:

group : int

The group part of a DICOM (group, element) tag.

Returns:

pydicom.dataset.Dataset

A dataset instance containing elements of the group specified.

items() → a set-like object providing a view on D's items

iterall()

Iterate through the Dataset, yielding all DataElements.

Unlike Dataset.__iter__, this does recurse into sequences, and so returns all data elements as if the file were “flattened”.

Yields:

pydicom.dataelem.DataElement

keys() → a set-like object providing a view on D's keys
pixel_array

Return the Pixel Data as a NumPy array.

Returns:

numpy.ndarray

The Pixel Data (7FE0,0010) as a NumPy ndarray.

pop(k[, d]) → v, remove specified key and return the corresponding value.

If key is not found, d is returned if given, otherwise KeyError is raised

popitem() → (k, v), remove and return some (key, value) pair as a 2-tuple; but raise KeyError if D is empty.

remove_private_tags()

Remove all private DataElements in the Dataset.

save_as(filename, write_like_original=True)

Write the Dataset to filename.

Saving a Dataset requires that the Dataset.is_implicit_VR and Dataset.is_little_endian attributes exist and are set appropriately. If Dataset.file_meta.TransferSyntaxUID is present then it should be set to a consistent value to ensure conformance.

Parameters:

filename : str or file-like

Name of file or the file-like to write the new DICOM file to.

write_like_original : bool

If True (default), preserves the following information from the Dataset (and may result in a non-conformant file):

  • preamble – if the original file has no preamble then none will be written.
  • file_meta – if the original file was missing any required File Meta Information Group elements then they will not be added or written. If (0002,0000) ‘File Meta Information Group Length’ is present then it may have its value updated.
  • seq.is_undefined_length – if the original had delimiters, write them now too, instead of the more sensible length characters.
  • is_undefined_length_sequence_item – for datasets that belong to a sequence, write the undefined length delimiters if that is what the original had.

If False, produces a file conformant with the DICOM File Format, with explicit lengths for all elements.

See also

pydicom.filewriter.write_dataset
Write a DICOM Dataset to a file.
pydicom.filewriter.write_file_meta_info
Write the DICOM File Meta Information Group elements to a file.
pydicom.filewriter.write_file
Write a DICOM file from a FileDataset instance.
setdefault(k[, d]) → D.get(k,d), also set D[k]=d if k not in D

top()

Return a str of the Dataset’s top level DataElements only.

trait_names()

Return a list of valid names for auto-completion code.

Used in IPython, so that data element names can be found and offered for autocompletion on the IPython command line.

update(dictionary)

Extend dict.update() to handle DICOM keywords.

values() → an object providing a view on D's values

walk(callback, recursive=True)

Iterate through the DataElements and run callback on each.

Visit all DataElements, possibly recursing into sequences and their datasets. The callback function is called for each DataElement (including SQ elements). Can be used to perform an operation on certain types of DataElements. E.g., remove_private_tags() finds all private tags and deletes them. DataElements will come back in DICOM order (by increasing tag number within their dataset).

Parameters:

callback

A callable that takes two arguments:
  • a Dataset
  • a DataElement belonging to that Dataset

recursive : bool

Flag to indicate whether to recurse into Sequences.