Ultrasound image-based contrastive fusion non-invasive liver fibrosis staging algorithm

Contrastive learning

The core idea of contrastive learning is to use deep-learning models to find differences at the abstract feature level rather than focusing on semantic associations in specific data. In contrastive learning, we first augment the original data, for example with random cropping, contrast changes, and exposure changes, to improve the model’s generalization ability. The encoder is then trained with a contrastive loss function so that features of the same type are encoded closer together, while features of different types of data are more scattered. This closeness and dispersion are directly reflected in the classification results.
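As a concrete illustration, the following is a minimal sketch of the augmentation step described above, written with torchvision; the crop size and jitter ranges are illustrative assumptions rather than the values used in this study.

```python
# Minimal sketch of the augmentation used before contrastive training:
# random cropping plus contrast and exposure (brightness) changes.
# The crop size and jitter ranges below are illustrative assumptions.
import torchvision.transforms as T

augment = T.Compose([
    T.RandomResizedCrop(224, scale=(0.6, 1.0)),   # random cropping
    T.ColorJitter(brightness=0.4, contrast=0.4),  # exposure/contrast changes
    T.ToTensor(),
])

def two_views(image):
    """Return two independently augmented views of one ultrasound image."""
    return augment(image), augment(image)
```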

Compared with a traditional convolutional neural network, models adopting the contrastive learning method consider more correlations between data. Traditional convolutional neural networks treat each sample as an isolated case during training; different samples are not correlated. Samples with similar or different semantic features are treated identically during training, and only each sample’s final label is used as the basis for the loss function. If the contrastive loss value correctly reflects whether samples share commonality, the features produced by the encoder contain this information.

In this study, we used a grid search strategy to tune key hyperparameters during model training, including the learning rate (1e-4 to 1e-2), batch size (16, 32, 64), and optimizer (Adam and SGD). To prevent overfitting, we adopted early stopping with a patience of 10 epochs. Data augmentation strategies (such as random cropping, brightness modification, contrast adjustment, and slight rotation) were applied consistently in contrastive learning and throughout model training. We also applied dropout to the fully connected layers with a dropout rate of 0.5. Models in all experiments were trained for more than 100 epochs, and the best-performing models were selected based on validation loss. These measures are intended to improve model robustness and reduce the risk of overfitting, especially when training with limited samples.
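The search procedure can be sketched as follows; `train_and_validate` is a hypothetical callable standing in for the actual training loop and is not part of the original work.

```python
# Sketch of the hyperparameter grid search described above. The caller
# supplies train_and_validate(lr, batch_size, optimizer) -> best val loss.
import itertools

def grid_search(train_and_validate):
    grid = {
        "lr": [1e-4, 1e-3, 1e-2],
        "batch_size": [16, 32, 64],
        "optimizer": ["adam", "sgd"],
    }
    best_loss, best_cfg = float("inf"), None
    for values in itertools.product(*grid.values()):
        cfg = dict(zip(grid.keys(), values))
        # Each run uses early stopping (patience = 10 epochs) and dropout 0.5,
        # and the best model is chosen by validation loss.
        loss = train_and_validate(**cfg)
        if loss < best_loss:
            best_loss, best_cfg = loss, cfg
    return best_cfg, best_loss
```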

FCL

Most image features in the ultrasound signal play no decisive role in assessing the grade of liver fibrosis; they mostly describe characteristics of the liver itself. Liver tissue at different grades of fibrosis shares the same liver-characteristic information, which is a redundant feature for fibrosis staging. This article introduces the concept of the FCL to filter out this shared, redundant information and isolate the features that distinguish fibrosis stages.

Figure 1 provides a schematic diagram of the FCL structure. We designed an attention mechanism specific to the characteristics of liver fibrosis because we found that, during liver texture assessment, different texture information contributes differently to the result [31]. Texture that appears across most liver fibrosis grades contributes less to the assessment. We use an attention mechanism to downweight this aspect of the features.

$$X_{1} = x + \mathrm{RB}\left( x \right)$$

(1)

$$\alpha = \mathrm{FB}\left( X_{1} \right),\qquad X_{2} = \alpha \odot X_{1}$$

(3)

Fig. 1

Schematic diagram of the liver Fibrosis Contrast Layer (FCL) model structure based on the attention mechanism

Formula 1 expresses the core principle of the attention mechanism: RB represents the residual processing module [32] that transforms the features, followed by batch normalization. We then employ a method similar to SENet [33], using a two-layer FB structure that learns to project the features into the important subspace and obtain improved features; Formula 3 applies the resulting attention weights \(\alpha\) to reweight the features. In our work, FB and RB can take many forms, including but not limited to multilayer perceptrons, convolutional structures, and fully connected structures.
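As one concrete possibility, the PyTorch sketch below implements an FCL block consistent with Formulas 1 and 3: a convolutional RB with batch normalization and a two-layer, SENet-style FB that outputs channel attention weights. Because RB and FB may take many forms, this is only one illustrative variant, not the exact architecture used in this work.

```python
import torch
import torch.nn as nn

class FCLBlock(nn.Module):
    """Illustrative FCL block: X1 = x + RB(x); output = FB(X1) * X1."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # RB: residual processing module with batch normalization (Formula 1)
        self.rb = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
        )
        # FB: two-layer bottleneck producing per-channel attention weights
        self.fb = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x1 = x + self.rb(x)                          # Formula 1
        alpha = self.fb(x1).unsqueeze(-1).unsqueeze(-1)
        return alpha * x1                            # Formula 3: reweighting
```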

After the FCL extracts the core classification features, the subsequent feature-processing step needs to handle only a small amount of data. Compared with the feature processing of a traditional convolutional neural network, a decision can be made with fewer parameters. Consequently, the model’s generalization ability is stronger, and fewer samples are required to fit the model’s parameters. After training, the model presented in this article performs significantly better than traditional algorithms on small datasets.

Contrastive loss

$$L_{\mathrm{con}} = \mathbb{E}_{\left(x_{i},x_{j}\right)\sim P^{+}}\left[{\left(D\left(x_{i},x_{j}\right)-1\right)}^{2}\right] + \mathbb{E}_{\left(x_{i},x_{j}\right)\sim P^{-}}\left[{D\left(x_{i},x_{j}\right)}^{2}\right]$$

(4)

The contrastive loss function is defined by Formula 4, in which \(P^{+}\) denotes the set of positive pairs, that is, pairs of liver fibrosis features from the same stage, and \(P^{-}\) denotes pairs of liver fibrosis features from different stages. D represents the designed decoder structure. We intend the decoder to learn the correlation between two liver fibrosis inputs of the same stage and to distinguish the difference between data of different liver fibrosis grades. The decoder can take many forms, such as a multilayer perceptron or a convolutional structure.
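A minimal sketch of Formula 4 follows, assuming the decoder D is a small multilayer perceptron that maps a concatenated feature pair to a similarity score in [0, 1]; the 128-dimensional features and the MLP shape are assumptions.

```python
import torch
import torch.nn as nn

decoder = nn.Sequential(          # hypothetical D: feature pair -> score
    nn.Linear(2 * 128, 64),
    nn.ReLU(inplace=True),
    nn.Linear(64, 1),
    nn.Sigmoid(),
)

def contrastive_loss(f_i, f_j, same_stage):
    """Formula 4: E_pos[(D - 1)^2] + E_neg[D^2].

    f_i, f_j: (N, 128) feature batches; same_stage: (N,) boolean mask
    marking pairs whose fibrosis stages match.
    """
    d = decoder(torch.cat([f_i, f_j], dim=1)).squeeze(1)
    pos = ((d[same_stage] - 1.0) ** 2).mean() if same_stage.any() else 0.0
    neg = (d[~same_stage] ** 2).mean() if (~same_stage).any() else 0.0
    return pos + neg
```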

Connection with contrastive learning

Our FCL design is related to the traditional contrastive learning algorithm; however, the level at which contrast is applied has changed significantly. Traditional contrastive learning algorithms treat multiple augmentations of a sample, or a sample and its own augmentation, as positive pairs, and treat all other samples and their augmentations as negatives. This article’s approach instead treats samples under the same label as positive pairs. Our study aims to use contrastive learning to find features that can distinguish fibrosis grades.

In contrast to the application scenarios of traditional contrastive learning algorithms, variation among fibrosis features within the same category is limited. Traditional contrastive learning uses only a sample itself or its augmentation as the most relevant positive because, in its usual application scenarios, samples under the same category differ considerably. For example, in the classic cat-and-dog classification task, the cat category contains many cat images of widely varying appearance. However, because ultrasound signals are collected in similar equipment environments, the differences between fibrosis stages are subtler than in traditional image classification, and samples within a stage are highly consistent. Consequently, samples within the same class can be used as positive pairs for contrastive learning.
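The label-based pairing can be sketched as follows: within a mini-batch, every pair of samples sharing a fibrosis stage forms a positive pair and every cross-stage pair forms a negative. The helper below is hypothetical and is written to feed the `contrastive_loss` sketch above.

```python
import itertools
import torch

def build_pairs(features: torch.Tensor, stages: torch.Tensor):
    """Return paired features and a same-stage mask for one mini-batch."""
    idx_i, idx_j, same = [], [], []
    for i, j in itertools.combinations(range(len(stages)), 2):
        idx_i.append(i)
        idx_j.append(j)
        same.append(bool(stages[i] == stages[j]))  # same label -> positive
    return features[idx_i], features[idx_j], torch.tensor(same, dtype=torch.bool)
```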

Label fusion

$$\begin{aligned} \hat{y} &= \mathrm{softmax}\left( z + \lambda \cdot \mathrm{sim}\left( f, E \right) \right)\\ L &= \mathrm{CrossEntropy}\left( y, \hat{y} \right) + L_{\mathrm{con}} \end{aligned}$$

(5)

A progressive relationship and correlation exist between the data of different liver fibrosis grades. Before presenting the model structure, we initialize an embedding vector for each category to represent that category’s characteristics. During training, we calculate the similarity between the corresponding feature and each embedding, assuming that training yields the most typical features of each category as its embedding value. Through this similarity calculation we obtain the relationship between a sample and each liver grade embedding, and thereby a more accurate feature expression. Figure 2 demonstrates this process using cosine similarity and a weighting factor \(\lambda\). Formula 5 shows the final loss after adding the LF structure, where \(z\) denotes the classification logits, \(f\) the extracted feature, \(E\) the grade embeddings, and \(L_{\mathrm{con}}\) the contrastive loss from Formula 4.
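Under the reconstruction in Formula 5, the LF structure can be sketched as follows: one learnable embedding per fibrosis grade, cosine similarity between the sample feature and each embedding, and a weight \(\lambda\) that adds the similarities to the classifier logits before the softmax. The feature dimension and the assumption of five grades (F0 to F4) are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LabelFusion(nn.Module):
    """Illustrative LF structure: fuse grade-embedding similarity into logits."""

    def __init__(self, feat_dim: int = 128, num_grades: int = 5,
                 lam: float = 0.5):
        super().__init__()
        # One learnable embedding per liver fibrosis grade
        self.embed = nn.Parameter(torch.randn(num_grades, feat_dim))
        self.lam = lam

    def forward(self, features: torch.Tensor, logits: torch.Tensor):
        # Cosine similarity between each sample feature and each grade embedding
        sim = F.cosine_similarity(
            features.unsqueeze(1), self.embed.unsqueeze(0), dim=-1)
        return F.softmax(logits + self.lam * sim, dim=1)  # Formula 5
```

The fused prediction \(\hat{y}\) then enters the cross-entropy term of Formula 5 together with the contrastive loss \(L_{\mathrm{con}}\).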

Fig. 2

Feature processing diagram
