adnipy package

Submodules

adnipy.adni module

Pandas dataframe extension for ADNI.

class adnipy.adni.ADNI(pandas_dataframe: DataFrame)[source]

Bases: object

Dataframe deals with ADNI data.

This class provides methods designed to work with data from the ADNI database.

DATES: ClassVar[list[str]] = ['Acq Date', 'Downloaded', 'EXAMDATE', 'EXAMDATE_bl', 'update_stamp', 'USERDATE', 'update_stamp', 'USERDATE', 'USERDATE2', 'SCANDATE', 'TAUTRANDT', 'update_stamp', 'USERDATE', 'USERDATE2', 'SCANDATE', 'TRANDATE', 'update_stamp']

INDEX: ClassVar[list[str]] = ['Subject ID', 'Image ID']

MAPPER: ClassVar[dict[str, str]] = {'ASSAYTIME': 'TAUTIME', 'Acq Date': 'SCANDATE', 'Image': 'Image ID', 'Image Data ID': 'Image ID', 'PTID': 'Subject ID', 'Subject': 'Subject ID'}

drop_dynamic() → DataFrame[source]

Remove images which are dynamic.

Drops all rows in which the Description contains ‘Dynamic’.

Returns:: A dataframe with only non-dynamic images.
Return type:: pd.DataFrame
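
The filtering rule described above can be sketched with plain pandas. This is only an illustration, not the method's actual implementation; the sample rows and the variable names are made up for the example.

```python
import pandas as pd

# Hypothetical sample data; only the "Description" column matters here.
images = pd.DataFrame(
    {
        "Image ID": [100001, 100002, 100003],
        "Description": ["AV1451 Dynamic", "AV1451 Coreg, Avg", "MPRAGE"],
    }
)

# Flag rows whose Description contains the literal substring "Dynamic".
is_dynamic = images["Description"].str.contains("Dynamic", regex=False)

# Keep only the rows that were not flagged.
static_images = images[~is_dynamic]
```

Here static_images keeps the rows with Image ID 100002 and 100003.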

groups(*, grouped_mci: bool = True) → dict[str, DataFrame][source]

Create a dataframe for each group and save it to a csv file.

Parameters:: grouped_mci (bool, default True) – If true, ‘LMCI’ and ‘EMCI’ are treated like ‘MCI’. However, the original values will still be in the output.
Returns:: Dictionary with a dataframe for each group.
Return type:: dict
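
A rough sketch of the grouping behaviour, assuming the group labels live in a column named "Group" (the column name, the sample labels 'CN' and 'AD', and the helper split_by_group are assumptions for illustration). It does not write any csv files; it only shows how ‘LMCI’ and ‘EMCI’ can share one key while the original values stay in the data.

```python
import pandas as pd

# Hypothetical column name "Group"; the real column may be named differently.
subjects = pd.DataFrame(
    {
        "Subject ID": ["100_S_1000", "101_S_1001", "102_S_1002", "103_S_1003"],
        "Group": ["CN", "EMCI", "LMCI", "AD"],
    }
)


def split_by_group(frame, grouped_mci=True):
    """Return one dataframe per group, leaving the 'Group' column untouched."""
    keys = frame["Group"]
    if grouped_mci:
        # Collapse both MCI subtypes into one key without altering the data.
        keys = keys.replace({"EMCI": "MCI", "LMCI": "MCI"})
    return {name: part for name, part in frame.groupby(keys)}


by_group = split_by_group(subjects)
```

With grouped_mci left at True, by_group["MCI"] holds both MCI rows, and its "Group" column still reads EMCI and LMCI.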

longitudinal() → DataFrame[source]

Keep only longitudinal data.

This requires an ‘RID’ or ‘Subject ID’ column in the dataframe. Do not use if multiple images are present for a single timepoint.

Returns:: A dataframe with only longitudinal data.
Return type:: pd.DataFrame

See also

drop_dynamic
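
One possible reading of "longitudinal" is "subjects with more than one row". The sketch below follows that assumption with plain pandas and is not taken from the adnipy source. It also suggests why several images at a single timepoint are a problem: they would be counted as repeat visits.

```python
import pandas as pd

# Hypothetical sample: one subject scanned twice, one scanned once.
images = pd.DataFrame(
    {
        "Subject ID": ["100_S_1000", "100_S_1000", "101_S_1001"],
        "Image ID": [100001, 100002, 100003],
    }
)

# keep=False marks every row of a repeated subject, not just the later ones.
has_followup = images.duplicated("Subject ID", keep=False)
longitudinal = images[has_followup]
```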

rid() → DataFrame[source]

Add a roster ID column.

Will not work if ‘RID’ is already present or ‘Subject ID’ is missing.

Returns:: Dataframe with a ‘RID’ column.
Return type:: pd.DataFrame

Examples

>>> subjects = {"Subject ID": ["100_S_1000", "101_S_1001"]}
>>> collection = pd.DataFrame(subjects)
>>> collection
   Subject ID
0  100_S_1000
1  101_S_1001
>>> collection.adni.rid()
   Subject ID   RID
0  100_S_1000  1000
1  101_S_1001  1001
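
Judging from the example above, the roster ID corresponds to the digits after the last underscore of the subject ID. A minimal sketch of that parsing step in plain Python, where rid_from_subject_id is a hypothetical helper and not part of adnipy:

```python
def rid_from_subject_id(subject_id: str) -> int:
    """Return the digits after the last underscore as an integer."""
    return int(subject_id.rsplit("_", 1)[-1])


rids = [rid_from_subject_id(s) for s in ["100_S_1000", "101_S_1001"]]
```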

standard_column_names() → DataFrame[source]

Rename dataframe columns to module standard.

This function helps when working with multiple dataframes, since the same data can have different names. It will also call rid() on the dataframe.

Returns:: A dataframe with standardized column names.
Return type:: pd.DataFrame

See also

rid

Examples

>>> subjects = pd.DataFrame({"Subject": ["101_S_1001", "102_S_1002"]})
>>> subjects
      Subject
0  101_S_1001
1  102_S_1002
>>> subjects.adni.standard_column_names()
   Subject ID   RID
0  101_S_1001  1001
1  102_S_1002  1002

>>> images = pd.DataFrame({"Image": [100001, 100002]})
>>> images
    Image
0  100001
1  100002
>>> images.adni.standard_column_names()
   Image ID
0    100001
1    100002
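
The renaming step alone can be approximated with DataFrame.rename and the MAPPER dictionary listed on the class. This sketch leaves out the rid() call that the method also performs, and the sample dataframe is invented for the example.

```python
import pandas as pd

# Copied from the MAPPER class attribute documented above.
MAPPER = {
    "ASSAYTIME": "TAUTIME",
    "Acq Date": "SCANDATE",
    "Image": "Image ID",
    "Image Data ID": "Image ID",
    "PTID": "Subject ID",
    "Subject": "Subject ID",
}

# Hypothetical download using three of the alternative column names.
download = pd.DataFrame(
    {
        "Subject": ["101_S_1001"],
        "Image Data ID": [100001],
        "Acq Date": ["2017-01-15"],
    }
)

# Columns missing from MAPPER are left unchanged by rename.
renamed = download.rename(columns=MAPPER)
```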

standard_dates() → DataFrame[source]

Change type of date columns to datetime.

Returns:: Dates will have the appropriate dtype.
Return type:: pd.DataFrame
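
As an illustration of the conversion, the loop below applies pandas.to_datetime to whichever date columns are present. It uses only a subset of the DATES list and a made-up dataframe; the real method may parse dates differently.

```python
import pandas as pd

# A few of the names listed in DATES; the full list is on the class.
date_columns = ["Acq Date", "EXAMDATE", "SCANDATE"]

# Hypothetical input with dates stored as text.
visits = pd.DataFrame(
    {
        "Subject ID": ["100_S_1000", "101_S_1001"],
        "SCANDATE": ["2017-01-15", "2018-03-02"],
    }
)

converted = visits.copy()
for column in date_columns:
    # Skip date columns that this particular dataframe does not have.
    if column in converted.columns:
        converted[column] = pd.to_datetime(converted[column])
```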

standard_index(index: list[str] | None = None) → DataFrame[source]

Process dataframes into a standardized format.

The output is easy to read. Applying functions to the output may not work as expected.

Parameters:: index (list of str, default None) – These columns will be the new index.
Returns:: A dataframe that is easy for humans to read.
Return type:: pd.DataFrame
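
A sketch of what such an index looks like, assuming the default corresponds to the INDEX list (‘Subject ID’, ‘Image ID’) and that rows are sorted afterwards; both points are assumptions, and the data is invented. It also shows one reason later functions may not behave as expected: the index columns are no longer regular columns.

```python
import pandas as pd

# Hypothetical input in arbitrary row order.
images = pd.DataFrame(
    {
        "Subject ID": ["101_S_1001", "100_S_1000"],
        "Image ID": [100002, 100001],
        "SCANDATE": ["2018-03-02", "2017-01-15"],
    }
)

# Move both identifier columns into a MultiIndex and sort by it.
indexed = images.set_index(["Subject ID", "Image ID"]).sort_index()

# "Subject ID" now lives in the index, so column lookups no longer find it.
subject_is_column = "Subject ID" in indexed.columns
```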

timepoints(second: Literal['first', 'last'] = 'first') → dict[str, DataFrame][source]

Extract timepoints from a dataframe.

Parameters:: second ({'first' or 'last'}, default 'first') – ‘last’ to have the latest, ‘first’ to have the earliest values for timepoint 2.

adnipy.adnipy module

Process ADNI study data with adnipy.

adnipy.adnipy.get_matching_images(left: DataFrame, right: DataFrame) → DataFrame[source]

Match different scan types based on closest date.

The columns ‘Subject ID’ and ‘SCANDATE’ are required.

Parameters:

left (pd.DataFrame) – Dataframe containing the tau scans.
right (pd.DataFrame) – Dataframe containing the MRI scans.

Returns:

For each timepoint there is a match from both inputs.

Return type:

pd.DataFrame
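
Closest-date matching per subject can be approximated with pandas.merge_asof. The sketch below is a swapped-in technique for illustration, not necessarily how get_matching_images works; the sample scans, the suffixes, and the extra SCANDATE_mri column are assumptions.

```python
import pandas as pd

# Hypothetical tau scans for one subject.
tau = pd.DataFrame(
    {
        "Subject ID": ["100_S_1000", "100_S_1000"],
        "SCANDATE": pd.to_datetime(["2017-01-10", "2018-02-01"]),
        "Image ID": [1, 2],
    }
)
# Hypothetical MRI scans for the same subject.
mri = pd.DataFrame(
    {
        "Subject ID": ["100_S_1000", "100_S_1000", "100_S_1000"],
        "SCANDATE": pd.to_datetime(["2017-01-20", "2017-09-01", "2018-01-25"]),
        "Image ID": [11, 12, 13],
    }
)

# merge_asof keeps only the left key, so copy the MRI date to its own column.
mri = mri.assign(SCANDATE_mri=mri["SCANDATE"])

# Both inputs must be sorted by the "on" key; "by" restricts matches to one subject.
matched = pd.merge_asof(
    tau.sort_values("SCANDATE"),
    mri.sort_values("SCANDATE"),
    on="SCANDATE",
    by="Subject ID",
    direction="nearest",
    suffixes=("_tau", "_mri"),
)
```

Note that merge_asof may pair the same right-hand row with several left-hand rows, so a real implementation might need an extra step to keep matches unique.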

adnipy.adnipy.read_csv(file: str | StringIO) → DataFrame[source]

Return a csv file as a pandas.DataFrame.

Recognizes missing values used in the ADNI database.

Parameters:: file (str, pathlib.Path) – The path to the .csv file.
Returns:: Returns the file as a dataframe.
Return type:: pd.DataFrame

See also

standard_column_names, standard_dates, standard_index
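
Missing-value handling of this kind can be sketched with the na_values argument of pandas.read_csv. The marker -4 below is an assumption chosen for illustration; the set of values that adnipy actually recognizes may differ. The input is an in-memory StringIO, which the signature above also accepts.

```python
import io

import pandas as pd

# Hypothetical file content; -4 stands in for a missing age.
csv_text = "Subject ID,AGE\n100_S_1000,72\n101_S_1001,-4\n"

# Assumption: -4 marks a missing value; the markers adnipy recognizes may differ.
frame = pd.read_csv(io.StringIO(csv_text), na_values=["-4"])
```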

adnipy.adnipy.timedelta(old: DataFrame, new: DataFrame) → Series[source]

Get timedelta between timepoints.

Parameters:

old (pd.DataFrame) – This is the older dataframe.
new (pd.DataFrame) – This is the newer dataframe.

Returns:

A series of timedelta values. See numpy's timedelta64 type for unit conversion options.

Return type:

pd.Series
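
A minimal sketch of the underlying arithmetic, assuming both dataframes share a subject index and carry the date in a ‘SCANDATE’ column (both assumptions; the function may select and align columns differently):

```python
import pandas as pd

# Hypothetical first and second timepoints, indexed by subject.
old = pd.DataFrame(
    {
        "Subject ID": ["100_S_1000", "101_S_1001"],
        "SCANDATE": pd.to_datetime(["2017-01-15", "2017-06-01"]),
    }
).set_index("Subject ID")
new = pd.DataFrame(
    {
        "Subject ID": ["100_S_1000", "101_S_1001"],
        "SCANDATE": pd.to_datetime(["2018-01-15", "2018-06-11"]),
    }
).set_index("Subject ID")

# Subtraction aligns on the index and yields a timedelta64 series.
delta = new["SCANDATE"] - old["SCANDATE"]

# The .dt accessor converts the result to whole days.
days = delta.dt.days
```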

adnipy.data module

Process data created in Matlab.

adnipy.data.image_id_from_filename(filename: str) → int[source]

Extract image ID of single ADNI .nii filename.

Images from the ADNI database have a specific formatting. Using regular expressions the image ID can be extracted from filenames.

Parameters:: filename (str) – It must contain the Image ID at the end.
Returns:: Image as a integer.
Return type:: numpy.int64

Examples

>>> image_id_from_filename("*_I123456.nii")
123456
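
The regular-expression approach can be sketched as follows. The pattern, the helper name image_id_sketch, and the sample filename are assumptions; the pattern used by adnipy may be stricter or looser.

```python
import re


def image_id_sketch(filename: str) -> int:
    """Return the digits between the final '_I' and the '.nii' extension."""
    match = re.search(r"_I(\d+)\.nii$", filename)
    if match is None:
        raise ValueError(f"no image ID found in {filename!r}")
    return int(match.group(1))


# Hypothetical filename in the style of an ADNI download.
image_id = image_id_sketch("ADNI_100_S_1000_MR_MPRAGE_I123456.nii")
```

Raising ValueError for filenames without a trailing image ID is a design choice of this sketch, so that malformed names fail loudly instead of returning a wrong number.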

Module contents

Process ADNI study data with adnipy.

class adnipy.ADNI(pandas_dataframe: DataFrame)[source]

Bases: object

Dataframe deals with ADNI data.

This class provides methods designed to work with data from the ADNI database.

DATES: ClassVar[list[str]] = ['Acq Date', 'Downloaded', 'EXAMDATE', 'EXAMDATE_bl', 'update_stamp', 'USERDATE', 'update_stamp', 'USERDATE', 'USERDATE2', 'SCANDATE', 'TAUTRANDT', 'update_stamp', 'USERDATE', 'USERDATE2', 'SCANDATE', 'TRANDATE', 'update_stamp']

INDEX: ClassVar[list[str]] = ['Subject ID', 'Image ID']

MAPPER: ClassVar[dict[str, str]] = {'ASSAYTIME': 'TAUTIME', 'Acq Date': 'SCANDATE', 'Image': 'Image ID', 'Image Data ID': 'Image ID', 'PTID': 'Subject ID', 'Subject': 'Subject ID'}

drop_dynamic() → DataFrame[source]

Remove images which are dynamic.

Drops all rows in which the Description contains ‘Dynamic’.

Returns:: A dataframe with only non-dynamic images.
Return type:: pd.DataFrame

groups(*, grouped_mci: bool = True) → dict[str, DataFrame][source]

Create a dataframe for each group and save it to a csv file.

Parameters:: grouped_mci (bool, default True) – If true, ‘LMCI’ and ‘EMCI’ are treated like ‘MCI’. However, the original values will still be in the output.
Returns:: Dictionary with a dataframe for each group.
Return type:: dict

longitudinal() → DataFrame[source]

Keep only longitudinal data.

This requires an ‘RID’ or ‘Subject ID’ column in the dataframe. Do not use if multiple images are present for a single timepoint.

Returns:: A dataframe with only longitudinal data.
Return type:: pd.DataFrame

See also

drop_dynamic

rid() → DataFrame[source]

Add a roster ID column.

Will not work if ‘RID’ is already present or ‘Subject ID’ is missing.

Returns:: Dataframe with a ‘RID’ column.
Return type:: pd.DataFrame

Examples

>>> subjects = {"Subject ID": ["100_S_1000", "101_S_1001"]}
>>> collection = pd.DataFrame(subjects)
>>> collection
   Subject ID
0  100_S_1000
1  101_S_1001
>>> collection.adni.rid()
   Subject ID   RID
0  100_S_1000  1000
1  101_S_1001  1001

standard_column_names() → DataFrame[source]

Rename dataframe columns to module standard.

This function helps when working with multiple dataframes, since the same data can have different names. It will also call rid() on the dataframe.

Returns:: A dataframe with standardized column names.
Return type:: pd.DataFrame

See also

rid

Examples

>>> subjects = pd.DataFrame({"Subject": ["101_S_1001", "102_S_1002"]})
>>> subjects
      Subject
0  101_S_1001
1  102_S_1002
>>> subjects.adni.standard_column_names()
   Subject ID   RID
0  101_S_1001  1001
1  102_S_1002  1002

>>> images = pd.DataFrame({"Image": [100001, 100002]})
>>> images
    Image
0  100001
1  100002
>>> images.adni.standard_column_names()
   Image ID
0    100001
1    100002

standard_dates() → DataFrame[source]

Change type of date columns to datetime.

Returns:: Dates will have the appropriate dtype.
Return type:: pd.DataFrame

standard_index(index: list[str] | None = None) → DataFrame[source]

Process dataframes into a standardized format.

The output is easy to read. Applying functions to the output may not work as expected.

Parameters:: index (list of str, default None) – These columns will be the new index.
Returns:: A dataframe that is easy for humans to read.
Return type:: pd.DataFrame

timepoints(second: Literal['first', 'last'] = 'first') → dict[str, DataFrame][source]

Extract timepoints from a dataframe.

Parameters:: second ({'first' or 'last'}, default 'first') – ‘last’ to have the latest, ‘first’ to have the earliest values for timepoint 2.

adnipy.get_matching_images(left: DataFrame, right: DataFrame) → DataFrame[source]

Match different scan types based on closest date.

The columns ‘Subject ID’ and ‘SCANDATE’ are required.

Parameters:

left (pd.DataFrame) – Dataframe containing the tau scans.
right (pd.DataFrame) – Dataframe containing the MRI scans.

Returns:

For each timepoint there is a match from both inputs.

Return type:

pd.DataFrame

adnipy.read_csv(file: str | StringIO) → DataFrame[source]

Return a csv file as a pandas.DataFrame.

Recognizes missing values used in the ADNI database.

Parameters:: file (str, pathlib.Path) – The path to the .csv file.
Returns:: Returns the file as a dataframe.
Return type:: pd.DataFrame

See also

standard_column_names, standard_dates, standard_index

adnipy.timedelta(old: DataFrame, new: DataFrame) → Series[source]

Get timedelta between timepoints.

Parameters:

old (pd.DataFrame) – This is the older dataframe.
new (pd.DataFrame) – This is the newer dataframe.

Returns:

A series of timedelta values. See numpy's timedelta64 type for unit conversion options.

Return type:

pd.Series