adnipy package
Submodules
adnipy.adni module
Pandas dataframe extension for ADNI.
- class adnipy.adni.ADNI(pandas_dataframe: DataFrame)[source]
Bases: object
Dataframe deals with ADNI data.
This class provides methods designed to work with data from the ADNI database.
- DATES: ClassVar[list[str]] = ['Acq Date', 'Downloaded', 'EXAMDATE', 'EXAMDATE_bl', 'update_stamp', 'USERDATE', 'update_stamp', 'USERDATE', 'USERDATE2', 'SCANDATE', 'TAUTRANDT', 'update_stamp', 'USERDATE', 'USERDATE2', 'SCANDATE', 'TRANDATE', 'update_stamp']
- INDEX: ClassVar[list[str]] = ['Subject ID', 'Image ID']
- MAPPER: ClassVar[dict[str, str]] = {'ASSAYTIME': 'TAUTIME', 'Acq Date': 'SCANDATE', 'Image': 'Image ID', 'Image Data ID': 'Image ID', 'PTID': 'Subject ID', 'Subject': 'Subject ID'}
- drop_dynamic() DataFrame[source]
Remove images which are dynamic.
Drops all rows in which the Description contains ‘Dynamic’.
- Returns:
A dataframe with only non-dynamic images.
- Return type:
pd.DataFrame
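The filtering described above can be approximated with plain pandas. This is a minimal sketch, not the package's implementation; the sample rows and the variable names `images`, `is_dynamic`, and `static_images` are made up for illustration.

```python
import pandas as pd

# Hypothetical image collection; only the "Description" column matters here.
images = pd.DataFrame(
    {
        "Image ID": [100001, 100002, 100003],
        "Description": ["AV1451 Dynamic", "AV1451 Coreg, Avg", "MPRAGE"],
    }
)

# Flag rows whose Description contains the substring "Dynamic".
# na=False treats a missing description as non-dynamic.
is_dynamic = images["Description"].str.contains("Dynamic", na=False)

# Keep only the rows that were not flagged.
static_images = images[~is_dynamic]
print(static_images)
```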
- groups(*, grouped_mci: bool = True) dict[str, DataFrame][source]
Create a dataframe for each group and save it to a csv file.
- Parameters:
grouped_mci (bool, default True) – If true, ‘LMCI’ and ‘EMCI’ are treated like ‘MCI’. However, the original values will still be in the output.
- Returns:
Dictionary with a dataframe for each group.
- Return type:
dict
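To illustrate what the grouped_mci flag means, here is a sketch of the splitting step only (writing the csv files is left out). The column name "Group", the helper `split_by_group`, and the sample rows are assumptions made for this example and are not confirmed by the documentation above.

```python
import pandas as pd

# Hypothetical collection; "Group" is an assumed column name.
collection = pd.DataFrame(
    {
        "Subject ID": ["100_S_1000", "101_S_1001", "102_S_1002", "103_S_1003"],
        "Group": ["CN", "EMCI", "LMCI", "AD"],
    }
)


def split_by_group(frame, *, grouped_mci=True):
    """Return one dataframe per group label (illustrative helper)."""
    labels = frame["Group"]
    if grouped_mci:
        # Merge the MCI subtypes for grouping purposes only; the "Group"
        # column of the returned frames keeps the original values.
        labels = labels.where(~labels.isin(["EMCI", "LMCI"]), "MCI")
    return {name: part for name, part in frame.groupby(labels)}


by_group = split_by_group(collection)
print(sorted(by_group))
print(by_group["MCI"])
```

With grouped_mci=False the same helper would return separate ‘EMCI’ and ‘LMCI’ entries.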
- longitudinal() DataFrame[source]
Keep only longitudinal data.
This requires an ‘RID’ or ‘Subject ID’ column in the dataframe. Do not use if multiple images are present for a single timepoint.
- Parameters:
images (pd.DataFrame) – This dataframe will be modified.
- Returns:
A dataframe with only longitudinal data.
- Return type:
pd.DataFrame
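One plausible reading of "longitudinal" is "subjects that appear more than once". The sketch below assumes exactly that and is not taken from the package source; the sample data is invented. It also shows why several images for a single timepoint are a problem: every extra row would count as another timepoint.

```python
import pandas as pd

# Hypothetical collection with one repeated subject.
images = pd.DataFrame(
    {
        "Subject ID": ["100_S_1000", "100_S_1000", "101_S_1001"],
        "SCANDATE": pd.to_datetime(["2016-01-10", "2017-02-15", "2016-03-01"]),
    }
)

# duplicated(keep=False) marks every row of a subject that occurs
# more than once, so subjects with a single row are dropped.
longitudinal_only = images[images.duplicated("Subject ID", keep=False)]
print(longitudinal_only)
```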
- rid() DataFrame[source]
Add a roster ID column.
Will not work if ‘RID’ is already present or ‘Subject ID’ is missing.
- Returns:
Dataframe with a ‘RID’ column.
- Return type:
pd.DataFrame
Examples
>>> subjects = {"Subject ID": ["100_S_1000", "101_S_1001"]}
>>> collection = pd.DataFrame(subjects)
>>> collection
   Subject ID
0  100_S_1000
1  101_S_1001
>>> collection.adni.rid()
   Subject ID   RID
0  100_S_1000  1000
1  101_S_1001  1001
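The example output suggests that the roster ID is the numeric part after the last underscore of the subject ID. Under that assumption, the same column can be derived with plain pandas; this is a sketch for illustration, not the package's code.

```python
import pandas as pd

collection = pd.DataFrame({"Subject ID": ["100_S_1000", "101_S_1001"]})

# Assumption: the RID is the integer after the last underscore.
# rsplit with n=1 splits once from the right, .str[-1] takes the last piece.
with_rid = collection.assign(
    RID=collection["Subject ID"].str.rsplit("_", n=1).str[-1].astype(int)
)
print(with_rid)
```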
- standard_column_names() DataFrame[source]
Rename dataframe columns to module standard.
This function helps when working with multiple dataframes, since the same data can have different names. It will also call rid() on the dataframe.
- Returns:
A dataframe with standardized column names.
- Return type:
pd.DataFrame
Examples
>>> subjects = pd.DataFrame({"Subject": ["101_S_1001", "102_S_1002"]})
>>> subjects
      Subject
0  101_S_1001
1  102_S_1002
>>> subjects.adni.standard_column_names()
   Subject ID   RID
0  101_S_1001  1001
1  102_S_1002  1002

>>> images = pd.DataFrame({"Image": [100001, 100002]})
>>> images
    Image
0  100001
1  100002
>>> images.adni.standard_column_names()
   Image ID
0    100001
1    100002
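The renaming step presumably boils down to applying the MAPPER attribute documented above. The sketch below does that with `DataFrame.rename` and leaves out the rid() call; the sample frame is invented and the approach is an assumption about how the mapping is used.

```python
import pandas as pd

# Copied from the MAPPER class attribute documented above.
MAPPER = {
    "ASSAYTIME": "TAUTIME",
    "Acq Date": "SCANDATE",
    "Image": "Image ID",
    "Image Data ID": "Image ID",
    "PTID": "Subject ID",
    "Subject": "Subject ID",
}

# Hypothetical frame using two of the alternative column names.
subjects = pd.DataFrame({"PTID": ["101_S_1001"], "Acq Date": ["1/10/2016"]})

# Columns that are not keys of MAPPER are left untouched by rename.
renamed = subjects.rename(columns=MAPPER)
print(renamed.columns.tolist())
```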
- standard_dates() DataFrame[source]
Change type of date columns to datetime.
- Returns:
Dates will have the appropriate dtype.
- Return type:
pd.DataFrame
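A rough equivalent of the date conversion, assuming it simply converts every column listed in DATES that is present in the dataframe. Only a subset of DATES is used here and the sample frame is made up.

```python
import pandas as pd

# Subset of the DATES class attribute documented above.
DATES = ["Acq Date", "EXAMDATE", "SCANDATE"]

frame = pd.DataFrame({"Subject ID": ["100_S_1000"], "EXAMDATE": ["2016-01-10"]})

converted = frame.copy()
for column in DATES:
    # Skip date columns that this particular dataframe does not have.
    if column in converted.columns:
        converted[column] = pd.to_datetime(converted[column])

print(converted.dtypes)
```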
- standard_index(index: list[str] | None = None) DataFrame[source]
Process dataframes into a standardized format.
The output is easy to read. Applying functions to the output may not work as expected.
- Parameters:
index (list of str, default None) – These columns will be the new index.
- Returns:
A dataframe that is easy for humans to read.
- Return type:
pd.DataFrame
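The warning about applying functions to the output is easier to see with a small example. The sketch assumes that the method moves the given columns into the index and that the default falls back to the documented INDEX attribute; neither detail is confirmed above, and the sample data is invented.

```python
import pandas as pd

# Copied from the INDEX class attribute documented above.
INDEX = ["Subject ID", "Image ID"]

images = pd.DataFrame(
    {
        "Subject ID": ["100_S_1000", "100_S_1000"],
        "Image ID": [100001, 100002],
        "Modality": ["PET", "MRI"],
    }
)

# Moving the columns into the index gives a compact, readable table ...
indexed = images.set_index(INDEX)
print(indexed)

# ... but the index levels are no longer regular columns, so code that
# expects indexed["Subject ID"] would raise a KeyError.
print("Subject ID" in indexed.columns)
```

Calling `indexed.reset_index()` turns the index levels back into regular columns.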
adnipy.adnipy module
Process ADNI study data with adnipy.
- adnipy.adnipy.get_matching_images(left: DataFrame, right: DataFrame) DataFrame[source]
Match different scan types based on closest date.
The columns ‘Subject ID’ and ‘SCANDATE’ are required.
- Parameters:
left (pd.DataFrame) – Dataframe containing the tau scans.
right (pd.DataFrame) – Dataframe containing the mri scans.
- Returns:
For each timepoint there is a match from both inputs.
- Return type:
pd.DataFrame
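Matching on the closest date per subject can be sketched with `pandas.merge_asof`. This names a swapped-in technique and is not necessarily what get_matching_images does internally; the sample scans, the suffixes, and the extra `MRI_DATE` column are assumptions made for the example.

```python
import pandas as pd

# Hypothetical tau and MRI scans for a single subject.
tau = pd.DataFrame(
    {
        "Subject ID": ["100_S_1000", "100_S_1000"],
        "SCANDATE": pd.to_datetime(["2016-01-10", "2017-02-15"]),
        "Image ID": [1, 2],
    }
)
mri = pd.DataFrame(
    {
        "Subject ID": ["100_S_1000", "100_S_1000", "100_S_1000"],
        "SCANDATE": pd.to_datetime(["2016-01-20", "2016-09-01", "2017-02-01"]),
        "Image ID": [11, 12, 13],
    }
)

# Keep the MRI date under its own name so it survives the merge.
mri = mri.assign(MRI_DATE=mri["SCANDATE"])

# merge_asof requires both inputs to be sorted by the "on" key.
matched = pd.merge_asof(
    tau.sort_values("SCANDATE"),
    mri.sort_values("SCANDATE"),
    on="SCANDATE",
    by="Subject ID",        # only match scans of the same subject
    direction="nearest",    # closest date, earlier or later
    suffixes=("_tau", "_mri"),
)
print(matched)
```

Note that this approach can assign the same right-hand scan to several left-hand scans; it does not enforce a one-to-one pairing.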
- adnipy.adnipy.read_csv(file: str | StringIO) DataFrame[source]
Return a csv file as a pandas.DataFrame.
Recognizes missing values used in the ADNI database.
- Parameters:
file (str, pathlib.Path) – The path to the .csv file.
- Returns:
Returns the file as a dataframe.
- Return type:
pd.DataFrame
See also
standard_column_names, standard_dates, standard_index
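Recognizing missing values can be approximated with the `na_values` argument of `pandas.read_csv`. The codes "-4" and "-1" below are an assumption about which markers are treated as missing; the list used by adnipy itself is not given above. The csv text is invented.

```python
import io

import pandas as pd

# Hypothetical csv content; -4 stands in for a missing score.
csv_text = "RID,EXAMDATE,SCORE\n1000,2016-01-10,-4\n1001,,27\n"

# Assumed missing-value codes, in addition to pandas' defaults
# (empty fields, "NA", "NaN", ...).
MISSING_CODES = ["-4", "-1"]

frame = pd.read_csv(io.StringIO(csv_text), na_values=MISSING_CODES)
print(frame)
```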
- adnipy.adnipy.timedelta(old: DataFrame, new: DataFrame) Series[source]
Get timedelta between timepoints.
- Parameters:
old (pd.DataFrame) – This is the older dataframe.
new (pd.DataFrame) – This is the newer dataframe.
- Returns:
The content will be timedelta values. Look into numpy for more options.
- Return type:
pd.Series
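The kind of result described here can be reproduced by subtracting two date columns. The sketch assumes both dataframes share a ‘SCANDATE’ column and are aligned on the subject ID; which columns adnipy.adnipy.timedelta actually uses is not stated above. The last line shows one of the numpy options for converting the values.

```python
import numpy as np
import pandas as pd

# Hypothetical first and second timepoints, indexed by subject.
old = pd.DataFrame(
    {
        "Subject ID": ["100_S_1000", "101_S_1001"],
        "SCANDATE": pd.to_datetime(["2016-01-10", "2016-03-01"]),
    }
).set_index("Subject ID")
new = pd.DataFrame(
    {
        "Subject ID": ["100_S_1000", "101_S_1001"],
        "SCANDATE": pd.to_datetime(["2016-01-20", "2016-03-31"]),
    }
).set_index("Subject ID")

# Subtraction aligns on the index, giving one timedelta per subject.
delta = new["SCANDATE"] - old["SCANDATE"]
print(delta)

# Dividing by a numpy timedelta converts the values to plain floats.
days = delta / np.timedelta64(1, "D")
print(days)
```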
adnipy.data module
Process data created in Matlab.
- adnipy.data.image_id_from_filename(filename: str) int[source]
Extract image ID of single ADNI .nii filename.
Images from the ADNI database have a specific formatting. Using regular expressions the image ID can be extracted from filenames.
- Parameters:
filename (str) – It must contain the Image ID at the end.
- Returns:
Image ID as an integer.
- Return type:
numpy.int64
Examples
>>> image_id_from_filename("*_I123456.nii")
123456
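A regular expression for this kind of filename might look as follows. The pattern, the helper name `extract_image_id`, and the sample filename are assumptions for illustration; the pattern used by the package may differ.

```python
import re


def extract_image_id(filename: str) -> int:
    """Return the digits between a trailing '_I' and '.nii' (illustrative)."""
    match = re.search(r"_I(\d+)\.nii$", filename)
    if match is None:
        raise ValueError(f"No image ID found in {filename!r}")
    return int(match.group(1))


# Hypothetical filename that ends with the image ID.
image_id = extract_image_id("ADNI_100_S_1000_I123456.nii")
print(image_id)
```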
Module contents
Process ADNI study data with adnipy.
- class adnipy.ADNI(pandas_dataframe: DataFrame)[source]
Bases: object
Dataframe deals with ADNI data.
This class provides methods designed to work with data from the ADNI database.
- DATES: ClassVar[list[str]] = ['Acq Date', 'Downloaded', 'EXAMDATE', 'EXAMDATE_bl', 'update_stamp', 'USERDATE', 'update_stamp', 'USERDATE', 'USERDATE2', 'SCANDATE', 'TAUTRANDT', 'update_stamp', 'USERDATE', 'USERDATE2', 'SCANDATE', 'TRANDATE', 'update_stamp']
- INDEX: ClassVar[list[str]] = ['Subject ID', 'Image ID']
- MAPPER: ClassVar[dict[str, str]] = {'ASSAYTIME': 'TAUTIME', 'Acq Date': 'SCANDATE', 'Image': 'Image ID', 'Image Data ID': 'Image ID', 'PTID': 'Subject ID', 'Subject': 'Subject ID'}
- drop_dynamic() DataFrame[source]
Remove images which are dynamic.
Drops all rows in which the Description contains ‘Dynamic’.
- Returns:
A dataframe with only non-dynamic images.
- Return type:
pd.DataFrame
- groups(*, grouped_mci: bool = True) dict[str, DataFrame][source]
Create a dataframe for each group and save it to a csv file.
- Parameters:
grouped_mci (bool, default True) – If true, ‘LMCI’ and ‘EMCI’ are treated like ‘MCI’. However, the original values will still be in the output.
- Returns:
Dictionary with a dataframe for each group.
- Return type:
dict
- longitudinal() DataFrame[source]
Keep only longitudinal data.
This requires an ‘RID’ or ‘Subject ID’ column in the dataframe. Do not use if multiple images are present for a single timepoint.
- Parameters:
images (pd.DataFrame) – This dataframe will be modified.
- Returns:
A dataframe with only longitudinal data.
- Return type:
pd.DataFrame
- rid() DataFrame[source]
Add a roster ID column.
Will not work if ‘RID’ is already present or ‘Subject ID’ is missing.
- Returns:
Dataframe with a ‘RID’ column.
- Return type:
pd.DataFrame
Examples
>>> subjects = {"Subject ID": ["100_S_1000", "101_S_1001"]}
>>> collection = pd.DataFrame(subjects)
>>> collection
   Subject ID
0  100_S_1000
1  101_S_1001
>>> collection.adni.rid()
   Subject ID   RID
0  100_S_1000  1000
1  101_S_1001  1001
- standard_column_names() DataFrame[source]
Rename dataframe columns to module standard.
This function helps when working with multiple dataframes, since the same data can have different names. It will also call rid() on the dataframe.
- Returns:
A dataframe with standardized column names.
- Return type:
pd.DataFrame
Examples
>>> subjects = pd.DataFrame({"Subject": ["101_S_1001", "102_S_1002"]})
>>> subjects
      Subject
0  101_S_1001
1  102_S_1002
>>> subjects.adni.standard_column_names()
   Subject ID   RID
0  101_S_1001  1001
1  102_S_1002  1002

>>> images = pd.DataFrame({"Image": [100001, 100002]})
>>> images
    Image
0  100001
1  100002
>>> images.adni.standard_column_names()
   Image ID
0    100001
1    100002
- standard_dates() DataFrame[source]
Change type of date columns to datetime.
- Returns:
Dates will have the appropriate dtype.
- Return type:
pd.DataFrame
- standard_index(index: list[str] | None = None) DataFrame[source]
Process dataframes into a standardized format.
The output is easy to read. Applying functions to the output may not work as expected.
- Parameters:
index (list of str, default None) – These columns will be the new index.
- Returns:
A dataframe that is easy for humans to read.
- Return type:
pd.DataFrame
- adnipy.get_matching_images(left: DataFrame, right: DataFrame) DataFrame[source]
Match different scan types based on closest date.
The columns ‘Subject ID’ and ‘SCANDATE’ are required.
- Parameters:
left (pd.DataFrame) – Dataframe containing the tau scans.
right (pd.DataFrame) – Dataframe containing the mri scans.
- Returns:
For each timepoint there is a match from both inputs.
- Return type:
pd.DataFrame
- adnipy.read_csv(file: str | StringIO) DataFrame[source]
Return a csv file as a pandas.DataFrame.
Recognizes missing values used in the ADNI database.
- Parameters:
file (str, pathlib.Path) – The path to the .csv file.
- Returns:
Returns the file as a dataframe.
- Return type:
pd.DataFrame
See also
standard_column_names, standard_dates, standard_index
- adnipy.timedelta(old: DataFrame, new: DataFrame) Series[source]
Get timedelta between timepoints.
- Parameters:
old (pd.DataFrame) – This is the older dataframe.
new (pd.DataFrame) – This is the newer dataframe.
- Returns:
The content will be timedelta values. Look into numpy for more options.
- Return type:
pd.Series