Simulation Scheme: There are 128 evenly spaced sensors. Each sensor is 0.27 mm wide, with 0.03 mm of empty space between adjacent sensors. We assume there is no attenuation. The image is 50 mm deep (in the z direction) and 38 mm wide (in the x direction). The bubble density is chosen as 260 bubbles per cm\(^2\). The transducer frequency is 5 MHz, and a single plane wave is transmitted. The pixel sizes are \( \frac{\lambda}{8} = 0.0385\,mm \) in the x direction and \( \frac{\lambda}{20} = 0.0154\,mm \) in the z direction. The total number of pixels is 990 in the x direction and 3247 in the z direction.
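These grid parameters can be checked from the wavelength (a quick sketch; the speed of sound of 1540 m/s is an assumption, as the report does not state it explicitly):

```python
# Sanity check of the simulation grid from the stated 5 MHz frequency.
c = 1540.0e3        # assumed speed of sound [mm/s]
f0 = 5.0e6          # transducer frequency [Hz]
lam = c / f0        # wavelength [mm] -> 0.308 mm

dx = lam / 8        # pixel size in x -> 0.0385 mm
dz = lam / 20       # pixel size in z -> 0.0154 mm

nz = round(50.0 / dz)   # 3247 pixels over the 50 mm depth
nx = round(38.0 / dx)   # ~987 pixels over the 38 mm width
```

Note that 38 mm / 0.0385 mm gives about 987 pixels; the reported 990 pixels corresponds to a width of 990 × 0.0385 ≈ 38.1 mm.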
Training process: Training is done using patches. Let us start with some definitions:
x: a patch from the ground-truth image ( \( 64\,pixels\times 64\,pixels \) ) ( \(1.97mm \times 4.8mm\) )
y: a patch from the Field2 simulation image ( \( 128\,pixels\times 128\,pixels \) ) ( \(1.97mm \times 4.8mm\) )
z: output of the network
f : blur kernel (Gaussian kernel with sigma=2 in pixel coordinates)
\[ w = z * f \]
\[ v = x*f \]
Then the training loss can be expressed as follows:
\[ loss = MSEloss(z-v) + \lambda \times L1loss(z)\]
step size (learning rate) = 2e-5
\[ \lambda = 0.01\]
Network Structure: Our network is based on U-Net, with batch normalization and dropout layers.
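The report does not give the full architecture; purely as an illustration, one stage of such a U-Net could look like this in PyTorch (the channel counts, dropout probability, and placement of dropout after the first activation are all assumptions):

```python
import torch
import torch.nn as nn

class DoubleConv(nn.Module):
    """One U-Net stage: two 3x3 convolutions, each followed by batch
    normalization and ReLU, with dropout after the first activation."""
    def __init__(self, c_in, c_out, p_drop=0.1):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),
            nn.BatchNorm2d(c_out),
            nn.ReLU(inplace=True),
            nn.Dropout2d(p_drop),
            nn.Conv2d(c_out, c_out, kernel_size=3, padding=1),
            nn.BatchNorm2d(c_out),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x)
```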
RESULTS
Note: In all loss graphs, the earliest losses are omitted, since the first training losses are generally very large; removing them makes the later behavior easier to see.
Training without label:
The following are the results when training each region separately:
Region 1:

Training Loss = 0.19711332617912392
Test Loss = 0.20458045625227494
Region 2:

Training Loss = 0.1960891251286879
Test Loss = 0.19649085655380477
Region 3:

Training Loss = 0.19749677761950912
Test Loss = 0.20406845757323835
Training with all regions together:

Training Loss = 0.19985176214470254
Test Loss = 0.20604501453760174
Using three different networks instead of a single network improves the test loss by more than 2%. The improvement is calculated as follows:
\[ Improvement = 1 - \frac{ \frac{TestLoss_1 + TestLoss_2 + TestLoss_3 }{3} }{TestLoss_{all}} \]
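With the reported per-region and joint test losses, the relative improvement of the averaged separate networks over the single network comes out at about 2.1%:

```python
# Relative improvement: how much lower the mean separate-network test
# loss is than the single-network ("all regions") test loss.
separate = [0.20458045625227494, 0.19649085655380477, 0.20406845757323835]
joint = 0.20604501453760174

mean_separate = sum(separate) / len(separate)
improvement = 1 - mean_separate / joint
print(f"{improvement:.1%}")   # -> 2.1%
```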
Training with label:
The following are the results when training each region separately:
Region 1:

Training Loss at epoch 300 = 0.33138459768415635
Test Loss at epoch 300= 0.5276035935579455
Region 2:

Training Loss at epoch 300= 0.27107622946405535
Test Loss at epoch 300= 0.45211572744749856
Region 3:

Training Loss at epoch 300= 0.3044815735109272
Test Loss at epoch 300= 0.48998950448715023
Training with all regions together:

Training Loss = 0.32412794625265046
Test Loss = 0.39815186476483977
Miscellaneous Experiments
1) I changed the training loss as follows:
\[ loss = MSEloss(w-v) + \lambda \times L1loss(z) \]
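Relative to the earlier objective, only the data term changes: the network output is blurred before comparison. A self-contained NumPy sketch of the modified loss (the Gaussian truncation radius and the L1 interpretation are assumptions, as before):

```python
import numpy as np

def _blur(img, sigma=2.0, radius=6):
    """Apply the blur kernel f by separable 2-D Gaussian convolution."""
    t = np.arange(-radius, radius + 1)
    g = np.exp(-t**2 / (2 * sigma**2))
    g /= g.sum()
    out = np.apply_along_axis(lambda r: np.convolve(r, g, mode="same"), 0, img)
    return np.apply_along_axis(lambda r: np.convolve(r, g, mode="same"), 1, out)

def modified_loss(z, x, lam=0.01):
    """loss = MSEloss(w - v) + lam * L1loss(z), with w = z*f and v = x*f."""
    w = _blur(z)   # blur the network output as well
    v = _blur(x)
    return np.mean((w - v) ** 2) + lam * np.mean(np.abs(z))
```

Because both sides are blurred, the network is no longer penalized for small localization offsets below the blur scale.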
The following are the results when training each region separately:
Region 1:

Training Loss at epoch 300 = 0.22519103461185688
Test Loss at epoch 300 = 0.23249626391369335
Region 2:

Training Loss at epoch 300= 0.22230404627391986
Test Loss at epoch 300= 0.22300771080132564
Region 3:

Training Loss at epoch 300= 0.22351187953247623
Test Loss at epoch 300= 0.2273676454813944
Training with all regions together:

Training Loss at epoch 90= 0.22636351862931922
Test Loss at epoch 90= 0.23172340765902044
2) I removed the last CNN layer, named self.conv9().

Training Loss = 0.2272135959966761
Test Loss = 0.2332808526618654
3) I changed the near-field patches' label from 1 to 0.

Training Loss = 0.30359066570625276
Test Loss = 0.5433338370370797
For a complete description of the experiment, specify f.
Also, you write:
“x: a patch from ground truth image ( 64 pixels × 64 pixels ) ( 1.97 mm × 4.8 mm )
y: a patch from Field2 simulation image ( 128 pixels × 128 pixels ) ( 1.97 mm × 4.8 mm )”
x and y seem to have different pixel sizes, but you say nothing about this. Also, what do you do with x and y? What about the idea of a square pixel in the output of the network?