Convolutional Neural Network Classifier

For training the CNN peroxisome model, we utilized Pex13 mutant cells, which expressed GFP-Cat signals throughout their cytoplasm.

The classification workflow is divided into three parts:

  1. Set training data

  2. Model Generation

  3. Evaluation

Set training data

The GFP-Cat images are segmented into individual cells and saved into two separate folders: one for phenotype images and one for images with normal peroxisomes.

from AIPyS import AIPS_cellpose as AC

For example, we use the Cellpose model to segment the images (as described in section 2):

AIPS_pose_object = AC.AIPS_cellpose(Image_name = 'catGFP.tif', path = 'data', model_type = 'cyto', channels=[0,0])
img = AIPS_pose_object.cellpose_image_load()

The function stackObjects_cellpose_ebimage_parametrs_method, similar to the R package EBImage (published by Pau et al.), breaks the mask input into 150-pixel single-cell images.

AIPS_pose_object = AC.AIPS_cellpose(Image_name = 'catGFP.tif', path = 'data', model_type = 'cyto', channels=[0,0])
img = AIPS_pose_object.cellpose_image_load()
mask, table = AIPS_pose_object.cellpose_segmantation(image_input=img[0,:,:])

Then the EBImage-like stacking function is used:

stack, _ = AIPS_pose_object.stackObjects_cellpose_ebimage_parametrs_method(image_input=img[0, :, :],
                                                                           extract_pixel=50,
                                                                           resize_pixel=150,
                                                                           img_label=table.index.values[0])  # e.g. the first labelled cell
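
To build the two folders mentioned above, the single-cell crops can be written to disk and sorted afterwards. A minimal sketch, looping over all segmented cells (the output folder and file naming are illustrative):

import os
import numpy as np
from skimage.io import imsave

# Illustrative output folder; crops are later sorted into 'norm' and 'pheno'.
out_dir = 'data/input_sc_mix'
os.makedirs(out_dir, exist_ok=True)

for i, label in enumerate(table.index.values):
    stack, _ = AIPS_pose_object.stackObjects_cellpose_ebimage_parametrs_method(image_input=img[0, :, :],
                                                                               extract_pixel=50,
                                                                               resize_pixel=150,
                                                                               img_label=label)
    if stack is None:
        continue  # skip labels that could not be extracted
    # rescale to 8-bit before writing a PNG
    crop = np.asarray(stack, dtype=np.float32)
    crop = (255 * (crop - crop.min()) / max(float(np.ptp(crop)), 1e-9)).astype(np.uint8)
    imsave(os.path.join(out_dir, 'cell_{:04d}.png'.format(i)), crop)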


  • Pau G, Fuchs F, Sklyar O, Boutros M, Huber W (2010). “EBImage—an R package for image processing with applications to cellular phenotypes.” Bioinformatics, 26(7), 979–981. doi: 10.1093/bioinformatics/btq046.

The single-cell images are organized into a training structure consisting of three directories, training_data, validation_data, and test_data, each containing two folders: one for positive samples (pheno) and one for negative samples (norm). The data is then ready for model training and validation.

from AIPyS import Taining_data_orgenizer as orgenizer 


pathInput = '/input_sc_mix'
pathOrigen = '/data/training_set'
labelA = 'norm'
labelB = 'pheno'
file_extention = 'png'


path_builder = orgenizer.classification_data_orgenizer(path_input = pathInput,
                                                       path_origen = pathOrigen,
                                                       label_A=labelA,
                                                       label_B =labelB,
                                                       file_extention =file_extention)

path_builder.get_file_names_list()

statment_a, statment_b, train_files, validate_files, test_files = path_builder.split_traning_set_and_copy()
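
For reference, the split-and-copy step performed by split_traning_set_and_copy can be sketched roughly as follows (the 70/15/15 split ratios and the flat copy are assumptions; the AIPyS organizer handles this internally):

import glob
import os
import random
import shutil

random.seed(0)
files = glob.glob(os.path.join(pathInput, '*.' + file_extention))
random.shuffle(files)

# Assumed 70/15/15 split between training, validation and test sets.
n = len(files)
splits = {'training_data': files[:int(0.7 * n)],
          'validation_data': files[int(0.7 * n):int(0.85 * n)],
          'test_data': files[int(0.85 * n):]}

for split_name, split_files in splits.items():
    dst = os.path.join(pathOrigen, split_name)
    os.makedirs(dst, exist_ok=True)
    for f in split_files:
        shutil.copy(f, dst)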

Directory structure:

training_set
│
└───training_data
│       pheno01.png
│       norm01.png
│       ...
│
└───validation_data
│       pheno02.png
│       norm02.png
│       ...
│
└───test_data
        pheno03.png
        norm03.png
        ...

Model Generation

The platform contains four CNN models; a sketch of a basic architecture follows the list below.

  • Basic CNN model

  • Basic CNN model with data augmentation

  • Transfer learning with augmentation, dropping layers 4 and 5

  • Transfer learning with augmentation, freezing all layers
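
For orientation, here is a minimal sketch of a basic binary CNN of the kind listed above (layer sizes and training settings are assumptions, not the actual AIPyS architecture, which is defined in model_builder):

from tensorflow.keras import layers, models

def basic_cnn_sketch(input_shape=(150, 150, 3)):
    # Illustrative architecture only; the real models live in AIPyS model_builder.
    model = models.Sequential([
        layers.Conv2D(16, (3, 3), activation='relu', input_shape=input_shape),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(32, (3, 3), activation='relu'),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation='relu'),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(256, activation='relu'),
        layers.Dense(1, activation='sigmoid'),  # binary output: norm vs. pheno
    ])
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model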

from AIPyS import model_builder as mb

We set the hyperparameters for the model:

import os

train_dir = os.path.join(pathOrigen, 'training_data')
val_dir = os.path.join(pathOrigen, 'validation_data')
test_dir = os.path.join(pathOrigen, 'test_data')
batch = 30
epoch = 50
step_per_epoch = int(9930 / 30)    # training images / batch size
validation_steps = int(1242 / 30)  # validation images / batch size
path_model = '/data/models'
IMG_DIM = (150, 150, 3)
imbalance_train = 921
imbalance_val = 115
model_name = '10precent.h5'
path_checkpoints = '/data/models/chakpoints_10p/'
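
Rather than hard-coding the image counts (9930 training and 1242 validation images above), they can be derived from the directory contents; a small sketch:

import glob
import os

# Count the single-cell PNGs on disk instead of hard-coding the totals.
# Adjust the pattern if norm/pheno images sit in subfolders.
n_train = len(glob.glob(os.path.join(train_dir, '**', '*.png'), recursive=True))
n_val = len(glob.glob(os.path.join(val_dir, '**', '*.png'), recursive=True))
step_per_epoch = n_train // batch
validation_steps = n_val // batch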

Initiate the model builder:

model_build = mb.model_builder(IMG_DIM=(150,150,3), path_training=train_dir, path_validation=val_dir,
                               batch=batch, epoch=epoch, input_shape=(150,150,3), steps_per_epoch_sel=step_per_epoch,
                               validation_steps=validation_steps, path_model=path_model, file_extention='png',
                               extract_size_train=imbalance_train, extract_size_val=imbalance_val)

Image files are loaded and converted into TensorFlow tensors.

TRimgScale, ValimgScale, TRlabels, Valabels, TRimg, Valimg, report = model_build.build_image__sets()
print(report)

train labels: ['norm', 'pheno', 'norm', 'norm', 'norm'], train_labels_enc: [0 1 0 0 0]

Train models:

cnn_basic = model_build.model_cnn_basic()
cnn_basic_Augmentation = model_build.model_cnn_basic_Augmentation()
cnn_transfer_learning_Augmentation_drop_layer_4and5 = model_build.model_cnn_transfer_learning_Augmentation_drop_layer_4and5()
transfer_learning_aug_dropout_freez_all = model_build.model_cnn_transfer_learning_Augmentation_freez_all()

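The path_model and model_name values defined above can be used to persist the trained weights. A minimal sketch, assuming the builder methods return standard tf.keras models:

import os

# Assumption: the model_builder methods return trained tf.keras models.
cnn_transfer_learning_Augmentation_drop_layer_4and5.save(os.path.join(path_model, model_name))
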
Evaluation

We assess the performance of the trained models on the held-out test set. First, the test files are converted into tensors matching the model input size.

import numpy as np
from tensorflow.keras.preprocessing.image import img_to_array, load_img

test_imgs = [img_to_array(load_img(img, target_size=IMG_DIM)) for img in test_files]
test_imgs = np.array(test_imgs)
test_imgs_scaled = test_imgs.astype('float32')
test_imgs_scaled /= 255

The labels are extracted from the file names and encoded as integers.

test_labels = [fn.split('/')[1].split('_')[0].strip() for fn in test_files]
num2class_label_transformer = lambda l: ['norm' if x[0] == 0 else 'pheno' for x in l]
class2num_label_transformer = lambda l: [0 if x == 'norm' else 1 for x in l]  # norm -> 0, pheno -> 1, matching the training encoding
test_labels_enc = class2num_label_transformer(test_labels[:5])

print('Test dataset shape:{}'.format(test_imgs.shape))
print(test_labels[0:5], test_labels_enc[0:5])

    Test dataset shape:(5967, 150, 150, 3)
    ['norm', 'pheno', 'norm', 'pheno', 'norm'] [0, 1, 0, 1, 0]


import matplotlib.pyplot as plt

predictions = cnn_transfer_learning_Augmentation_drop_layer_4and5.predict(test_imgs_scaled, verbose=0)
plt.hist(predictions)

[Figure: histogram of the predicted probabilities]

classes_x = [np.where(lab > 0.5, 1, 0).tolist() for lab in predictions]
predictions_label = num2class_label_transformer(classes_x)
predictions_label
# meu is assumed to be a model-evaluation utilities helper providing get_metrics
meu.get_metrics(test_labels, predictions_label)
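
If the meu helper is not available, equivalent metrics can be computed with scikit-learn; a minimal sketch:

from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

print('Accuracy:', accuracy_score(test_labels, predictions_label))
print(confusion_matrix(test_labels, predictions_label, labels=['norm', 'pheno']))
print(classification_report(test_labels, predictions_label, labels=['norm', 'pheno']))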