## Electrochemical and Biological Simulation for Microfluidics

My graduate research focuses on biosensing applications that use electrochemical impedimetric methods. Unlike mechatronic systems, these applications involve a dynamic environment at the microscale, where it is hard to perceive what is really going on when all we register is a change in a physical property. I often ask myself: what is the underlying mechanism? Because everything happens below the limit of what the human eye can resolve, we often see nothing at all while running these experiments.

That is why simulation is so important. In this project, I performed three simulation tasks in COMSOL Multiphysics to visualize how the key physical properties change in my study (an impedimetric microfluidic chip for biosensing):

1. Velocity field simulation inside microfluidic channel.
2. Real-time molecule immobilization on gold surface.
3. Electrochemical impedance spectroscopy (EIS) simulation.

The result of task 1 supports the modeling assumptions of tasks 2 and 3.

Physical Environment Setting

The simulation environment is the interior of a microfluidic channel with gold microelectrodes. See the video below for a visualization.

The chip is fabricated using soft lithography and photolithography. The microfluidic channel has a width of 1 mm and a height of 100 μm at the center. The gold microelectrodes form a pair of square pads (300 × 300 μm) at the center of the channel (Fig. 1).

Figure 1. Microfabrication process, dimensions, and microscopic view of the microfluidic electrode chip used in this project.

Velocity Field Simulation Inside Microfluidic Channel

The objective of this task is to simulate the fluid velocity field inside the channel on sliced planes. Because 3D simulation is time-consuming, reducing the 3D environment to 2D saves a large amount of computation time. This task checks whether such dimension reduction is feasible for the models in tasks 2 and 3.

Considering the physical nature of the microfluidic channel, a laminar flow model is implemented along with the Navier-Stokes equation:

$$\rho(\textbf{u}\cdot\nabla)\textbf{u}=\nabla\cdot[-pI+\mu(\nabla \textbf{u}+(\nabla \textbf{u})^T)]+F$$

This equation expresses a balance between inertial ($$\rho(\textbf{u}\cdot\nabla)\textbf{u}$$), pressure ($$-pI$$), viscous ($$\mu(\nabla \textbf{u}+(\nabla \textbf{u})^T)$$) and external ($$F$$) forces. A stationary study is implemented together with the continuity equation for incompressible flow ($$\rho \nabla \cdot \textbf{u}=0$$), and water is set as the fluid.

Fig. 2 shows the 3D velocity field inside the channel. A bigger arrow indicates a larger velocity magnitude. The velocity is higher at the center because the channel height is lower there: for a fixed flow rate, a smaller cross-section requires a faster flow.

Figure 2. 3D velocity field inside microfluidic channel.

Fig. 3 shows animated 2D velocity fields on sliced planes inside the channel. The velocity falls toward zero near the channel walls, as expected for wall-bounded laminar flow, while away from the walls the steady-state velocity reaches a nearly constant value.

Figure 3. 2D velocity fields for xy and zy sliced planes.

A top view of the velocity field and the velocity magnitude at different x positions are shown in Fig. 4. At the center of the channel, where the microelectrodes lie, the fluid velocity stays constant, so the subsequent tasks can be carried out using 2D models.

Figure 4. Top-view and x position-dependent velocity magnitude of fluid.
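The flat velocity profile over the electrode region can be cross-checked without COMSOL. The sketch below evaluates the standard Fourier-series solution for fully developed pressure-driven flow in a rectangular duct. It assumes a uniform 1000 × 100 μm cross-section (the real channel height varies along its length), omits the constant prefactor, and uses coordinate names (`w_pos`, `h_pos`) that are my own rather than the axes of the COMSOL model.

```python
import math

def cosh_ratio(a, b):
    """Return cosh(a) / cosh(b) without overflow for large arguments."""
    return math.exp(a - b) * (1.0 + math.exp(-2.0 * a)) / (1.0 + math.exp(-2.0 * b))

def duct_velocity(w_pos, h_pos, width, height, n_terms=50):
    """Unnormalised axial velocity in a rectangular duct.

    w_pos runs across the width from -width/2 to +width/2,
    h_pos runs across the height from 0 to height (same length unit).
    The prefactor (pressure gradient, viscosity) is left out, so only
    ratios of the returned values are meaningful.
    """
    total = 0.0
    for k in range(n_terms):
        n = 2 * k + 1  # only odd terms contribute
        a = n * math.pi * (abs(w_pos) / height)
        b = n * math.pi * (width / (2.0 * height))
        total += (1.0 - cosh_ratio(a, b)) * math.sin(n * math.pi * h_pos / height) / n**3
    return total

WIDTH_UM = 1000.0   # channel width from the text
HEIGHT_UM = 100.0   # channel height at the center from the text

u_centre = duct_velocity(0.0, HEIGHT_UM / 2, WIDTH_UM, HEIGHT_UM)
for w_um in (0.0, 150.0, 300.0, 400.0, 450.0, 490.0, 500.0):
    u = duct_velocity(w_um, HEIGHT_UM / 2, WIDTH_UM, HEIGHT_UM)
    print(f"offset from centre = {w_um:5.0f} um   u / u_centre = {u / u_centre:.4f}")
```

An offset of 150 μm corresponds to the edge of a 300 μm pad centred in the channel, so the printed ratio at that position indicates how uniform the flow is over the electrode.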

Real-time Molecule Immobilization on Gold Surface

For microfluidic electrochemical biosensors, sensing elements are commonly immobilized on electrode surfaces (e.g. Au) at the center of the channel. In this task, molecules are simulated to flow through the channel and become immobilized on the electrode pad at the bottom center. A slice on the zy plane is used as the modeling geometry (Fig. 5).

Figure 5. Modeling geometry used in task 2. A sliced area of the microfluidic channel with pad electrode at the bottom center is used.

Here, the molecules convect and diffuse near the surface. The rate of immobilization is determined by several factors, including the inlet concentration (c0), the diffusion coefficient (D), and the maximum surface molecule density (Γs). The convection-diffusion equation and the transport-adsorption equation are used in a time-dependent study:

$${{\partial c} \over {\partial t}} + \nabla \cdot (-D\nabla c) + \textbf{u} \cdot \nabla c = R$$

$${{\partial c_s} \over {\partial t}} + \nabla \cdot (-D\nabla c_s) = k_{ads} c(\Gamma_s - c_s)-k_{des} c_s$$

The transport-adsorption equation states that the rate of change of the surface concentration plus the surface diffusion term equals the net rate of Langmuir adsorption, i.e. adsorption onto free sites minus desorption.
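To get a feel for the adsorption term on its own, here is a minimal sketch that integrates the Langmuir kinetics for the fractional coverage θ = cs/Γs. It assumes that the concentration at the surface is held at c0 and that surface diffusion is negligible, which the COMSOL model does not assume. The rate constants `K_ADS` and `K_DES` are hypothetical placeholders, not the values used in the simulation.

```python
import math

def coverage_rate(theta, c_bulk, k_ads, k_des):
    """d(theta)/dt for theta = c_s / Gamma_s (Langmuir kinetics)."""
    return k_ads * c_bulk * (1.0 - theta) - k_des * theta

def integrate_coverage(c_bulk, k_ads, k_des, t_end, dt):
    """Integrate theta(t) from an empty surface with the classical RK4 scheme."""
    theta = 0.0
    for _ in range(int(round(t_end / dt))):
        k1 = coverage_rate(theta, c_bulk, k_ads, k_des)
        k2 = coverage_rate(theta + 0.5 * dt * k1, c_bulk, k_ads, k_des)
        k3 = coverage_rate(theta + 0.5 * dt * k2, c_bulk, k_ads, k_des)
        k4 = coverage_rate(theta + dt * k3, c_bulk, k_ads, k_des)
        theta += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return theta

def analytic_coverage(c_bulk, k_ads, k_des, t):
    """Closed-form solution of the same equation, used as a cross-check."""
    k_obs = k_ads * c_bulk + k_des
    return (k_ads * c_bulk / k_obs) * (1.0 - math.exp(-k_obs * t))

K_ADS = 1.0e3   # 1/(M s), hypothetical
K_DES = 1.0e-5  # 1/s, hypothetical
T_END = 10 * 3600.0  # 10 hours in seconds

for c0 in (1e-8, 1e-7, 1e-6):  # inlet concentrations in M
    numeric = integrate_coverage(c0, K_ADS, K_DES, T_END, dt=10.0)
    exact = analytic_coverage(c0, K_ADS, K_DES, T_END)
    print(f"c0 = {c0:.0e} M   theta(10 h) = {numeric:.4f}   analytic = {exact:.4f}")
```

The coverage rises faster and saturates higher as c0 increases, which is the qualitative shape of a binding curve. Quantitative agreement with the simulation would require the real rate constants and the transport terms.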

Fig. 6 shows the time-dependent surface concentration (cs) change when c0 = 1 μM, and Fig. 7 shows the binding curve (probe density vs time) for different values of c0.

Figure 6. Time-dependent surface concentration (cs) change. t = 0 to 18 hr.

Figure 7. Probe surface density (molecules/cm²) vs time (hr).

At concentrations above 0.1 μM, the probe density nearly saturates at 9.6×10¹² molecules/cm² within 10 hours of immobilization. The result closely resembles a typical binding curve, which suggests that simulation can assist in optimizing in vitro parameters and in understanding the underlying mechanisms of the system.

Electrochemical Impedance Spectroscopy (EIS) Simulation

EIS is a rapid and label-free method for detection of bio-molecules, and is widely implemented on a variety of biosensors. In this task, I simulated EIS diagrams by changing the values of the heterogeneous rate constant (k0) and the double layer capacitance (Cdl). Both Cdl and k0 are affected by the immobilized surface molecule density on an electrode surface, and are important physical properties when analyzing EIS data. A slice on the xy plane is used as the modeling geometry of this task (Fig. 8).

Figure 8. Geometry being simulated for task 3. A 2D plane is sliced in the xy direction.

Here, a sinusoidal voltage (amplitude ≈ 5 mV) is applied between the two electrodes, and Bode and Nyquist plots are drawn from the measured impedance. An equivalent circuit can be constructed from the surface redox reaction; the one used in this task is shown in Fig. 9.

Figure 9. Equivalent circuit used in my research and task 3.

Fick’s second law and the Butler-Volmer equation are used for the simulation:

$${{\partial c}\over{\partial t}} = \nabla \cdot (D\nabla c)$$

$$j = nFk_0 (c_{Red} e^{ {(n-\alpha_c)F\eta} \over {RT} } - c_{Ox} e^{ {-\alpha_c F\eta} \over {RT} })$$

EIS plots are simulated for different values of k0 and Cdl (Fig. 10).

Figure 10. Bode and Nyquist plots for the simulated EIS data. k0 ranges from 0.001 to 0.1 cm/s, and Cdl ranges from 0.01 to 100 μF/cm².
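The way k0 and Cdl shape the spectra can be illustrated with a much simpler model than the COMSOL one. The sketch below linearizes the Butler-Volmer expression above around η = 0 to obtain a charge-transfer resistance, then evaluates a simplified Randles circuit (solution resistance in series with Rct parallel to Cdl). This circuit is an assumption for illustration and is not necessarily the circuit in Fig. 9; diffusion (Warburg) impedance is ignored. The redox concentration, `R_S`, n, α_c and the temperature are hypothetical values.

```python
import math

F = 96485.332      # Faraday constant, C/mol
R_GAS = 8.314462   # gas constant, J/(mol K)

def butler_volmer(eta, k0, c_red, c_ox, n=1, alpha_c=0.5, temp=298.15):
    """Current density in A/cm^2; k0 in cm/s, concentrations in mol/cm^3."""
    f = F / (R_GAS * temp)
    return n * F * k0 * (c_red * math.exp((n - alpha_c) * f * eta)
                         - c_ox * math.exp(-alpha_c * f * eta))

def charge_transfer_resistance(k0, c, n=1, alpha_c=0.5, temp=298.15):
    """Area-specific R_ct (ohm cm^2) from the slope of j(eta) at eta = 0."""
    d_eta = 1e-6  # small overpotential step in volts
    slope = (butler_volmer(d_eta, k0, c, c, n, alpha_c, temp)
             - butler_volmer(-d_eta, k0, c, c, n, alpha_c, temp)) / (2.0 * d_eta)
    return 1.0 / slope

def impedance(freq_hz, r_s, r_ct, c_dl):
    """Complex impedance of R_s in series with (R_ct parallel to C_dl)."""
    omega = 2.0 * math.pi * freq_hz
    return r_s + r_ct / (1.0 + 1j * omega * r_ct * c_dl)

C_REDOX = 1.0e-6  # mol/cm^3 (1 mM), hypothetical
R_S = 10.0        # ohm cm^2, hypothetical
C_DL = 1.0e-5     # F/cm^2 (10 uF/cm^2), inside the range of Fig. 10

for k0 in (0.001, 0.01, 0.1):  # cm/s, the range quoted for Fig. 10
    r_ct = charge_transfer_resistance(k0, C_REDOX)
    z = impedance(100.0, R_S, r_ct, C_DL)
    print(f"k0 = {k0:5.3f} cm/s   R_ct = {r_ct:8.2f} ohm cm^2   "
          f"|Z(100 Hz)| = {abs(z):8.2f}   phase = {math.degrees(math.atan2(z.imag, z.real)):6.1f} deg")
```

In this simplified picture a larger k0 shrinks the Nyquist semicircle (smaller Rct), while Cdl sets the frequency at which the semicircle peaks, roughly 1/(2π·Rct·Cdl).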

Through this simulation project, I gained a deeper understanding of the fundamental interactions between the physical properties and the outcomes of my research, and developed new ideas for improving it.

After completing this project, I also used COMSOL to simulate the time-dependent concentration gradients of redox molecules in my first-author journal paper, “Diffusion impedance modeling for interdigitated array electrodes by conformal mapping and cylindrical finite length approximation”.

## Survival Rate Prediction Model for Startup Companies

This is an end-of-semester project that I completed with my groupmates in college. We proposed a model that predicts a startup company’s future status using a deep multilayer perceptron (MLP) and a decision tree, and we identified the key factors and how they influence the result. My main role was data quantification and literature review. Our main hypothesis is that money, people and active days are the key factors. We built the prediction model on data from CrunchBase, which is the largest public database with relevant profiles about companies.

Data Quantification

Four groups of data are quantified: state, employee range, roles and country. For the state group (which exists only for the USA and Canada), we assume that different states have different impacts on the survival rate of a startup, so a lookup table of state scores is defined [1][2][3]. The original employee counts are given as a range between two numbers (e.g. 101-250); each range is converted to the average of its upper and lower bounds, and the open-ended bracket 10000+ is set to 15000. For the roles group, “company” is arbitrarily assigned 0.1 and “company, investor” is assigned 0.9, because we regard a company that also acts as an investor as wealthier and more influential than one that is only a company. All other roles are assigned 0.5. The impact of country on the startup environment is also considered [4]: scores from 0.329 to 0.947 are given to countries that have more than 300 startup companies on record.
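The employee-range and role rules above translate directly into code. The following is a minimal sketch; the function names and the exact string format of the raw fields are assumptions, and the state and country lookup tables are not reproduced here.

```python
def employee_midpoint(range_text):
    """Convert an employee range such as '101-250' to a single number."""
    text = range_text.strip()
    if text == "10000+":
        return 15000.0  # value chosen for the open-ended top bracket
    low, high = text.split("-")
    return (int(low) + int(high)) / 2

# Scores for the roles field; any role not listed here maps to 0.5.
ROLE_SCORES = {"company": 0.1, "company, investor": 0.9}

def role_score(roles):
    """Map the roles field to the score described in the text."""
    return ROLE_SCORES.get(roles.strip().lower(), 0.5)

for raw in ("1-10", "101-250", "10000+"):
    print(raw, "->", employee_midpoint(raw))
for raw in ("company", "company, investor", "investor"):
    print(raw, "->", role_score(raw))
```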

Table 1. Company status and scores of 23 countries.

Neural Network Implementation

The ANN runs on Windows 10 (i7-8700K CPU, 16 GB DDR4-2666 RAM, Nvidia GTX 1070 Ti GPU). Keras is used to construct the network, and three multilayer perceptron (MLP) networks with different numbers of layers are designed for comparison. Each network has 4 outputs, which indicate the probabilities of the final status: (1) Closed, (2) Operating, (3) Acquired and (4) IPO.

Figure 1. MLP Structure of the 3 neural networks.

The three networks have 2, 4 and 6 hidden layers respectively; each starts with a 1024-neuron hidden layer and ends with a 32-neuron hidden layer (Fig. 1). Table 2 shows the training results.

Table 2. Mean square error (MSE) and categorical cross-entropy loss for the 3 networks.
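For readers unfamiliar with the two metrics in Table 2, the sketch below computes them for a single sample with four status classes. It assumes a softmax output layer, which is the usual way to obtain class probabilities but is not stated above, and the logits are made-up numbers rather than outputs of the trained networks.

```python
import math

STATUSES = ("Closed", "Operating", "Acquired", "IPO")

def softmax(logits):
    """Turn raw network outputs into probabilities that sum to 1."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def one_hot(status):
    """Encode the true status as a 4-element target vector."""
    return [1.0 if name == status else 0.0 for name in STATUSES]

def categorical_cross_entropy(target, probs, eps=1e-12):
    """Negative log-probability assigned to the true class."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(target, probs))

def mean_squared_error(target, probs):
    """Average squared difference between target and predicted probabilities."""
    return sum((t - p) ** 2 for t, p in zip(target, probs)) / len(target)

logits = [0.2, 1.5, 0.3, -1.0]  # hypothetical raw outputs for one company
probs = softmax(logits)
target = one_hot("Operating")
print("probabilities:", [round(p, 3) for p in probs])
print("cross-entropy:", round(categorical_cross_entropy(target, probs), 4))
print("MSE:", round(mean_squared_error(target, probs), 4))
```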

Almost all configurations score close to 70%, with little difference between them. In general, ReLU activation gives better results than sigmoid activation, while linear activation may outperform ReLU when the network is deep enough.

The neural network behaves like a black box, and it is difficult to draw meaningful insights just by looking at the trained parameters. Because interpretability matters for business decisions, we used a decision tree to understand which factors are the most important and how important each one is.

Decision Tree Training

We tried three different depths of decision tree: 4 (Fig. 2), 6 (Fig. 3), and 10. We set the gain ratio to be the criterion, and the confidence to 0.1.
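The gain ratio criterion can be written out in a few lines. The sketch below scores a single binary split on a numeric feature as information gain divided by split information. The toy employee counts and statuses are invented for illustration, and the sketch does not cover pruning (the confidence setting) or the tool that was actually used.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())

def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary split 'value <= threshold'."""
    left = [lab for val, lab in zip(values, labels) if val <= threshold]
    right = [lab for val, lab in zip(values, labels) if val > threshold]
    if not left or not right:
        return 0.0  # the split separates nothing
    n = len(labels)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    info_gain = entropy(labels) - weighted
    split_info = entropy(["left"] * len(left) + ["right"] * len(right))
    return info_gain / split_info

# Hypothetical toy data: quantified employee counts and final statuses.
employees = [5.5, 30.5, 175.5, 375.5, 750.5, 3000.5, 7500.5, 15000.0]
status = ["Closed", "Closed", "Operating", "Closed",
          "Acquired", "Operating", "Acquired", "IPO"]

for threshold in (100, 500, 5000):
    print(f"employees <= {threshold}: gain ratio = "
          f"{gain_ratio(employees, status, threshold):.3f}")
```

At each node, a tree builder using this criterion would pick the feature and threshold with the highest gain ratio, which penalizes splits that produce very unbalanced branches less than plain information gain would reward them.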

Figure 2. Decision tree of depth = 4.

Figure 3. Decision tree of depth = 6.

The results show that if a company is large enough to exceed 500 employees, its closure rate is low. In most cases, large companies end up being acquired. In addition to the number of employees, the total funding amount also has a significant impact on whether a company eventually survives or closes.

Key Findings

Among the factors that affect a startup’s ability to survive, the number of employees, the total funding amount, and the number of active days have a significant influence. In contrast, country, region and the number of funding rounds do not. Whether a company also acts as an investor influences whether it will reach an IPO or be acquired.

Future Work

Our future work will integrate the insights gained from this case study with a methodology of our own. Four items are planned:

1. Construct a heterogeneous relationship network for survival rate prediction [5].
2. Define a data path score according to HeteSim algorithm [6][7].
3. Predict company survival rate using MLP, decision tree and other neural networks.
4. Predict how much money a company will raise.

References

1. Bill Murphy Jr., The Start-up Hall of Shame (America’s 10 Worst States for Entrepreneurs), © 2018 Mansueto Ventures, inc.com/bill-murphy-jr/the-startup-hall-of-shame-americas-10-worst-states-for-entrepreneurs.html
2. Bill Murphy Jr., 10 Top States for Entrepreneurship and Innovation, © 2018 Mansueto Ventures, inc.com/bill-murphy-jr/ranking-the-10-top-states-for-entrepreneurship-and-innovation.html
3. Enterprising States: States Innovate, © 2015 The U.S. Chamber of Commerce Foundation, www.uschamberfoundation.org/enterprisingstates/assets/files/Executive-Summary-OL.pdf
4. Zameena Mejia, The top 10 best countries for entrepreneurs in 2018, © 2019 CNBC LLC, https://www.cnbc.com/2018/02/05/us-world-news-report-2018-top-10-best-countries-for-entrepreneurs.html
5. Xiangxiang Zeng, You Li, Stephen C.H. Leung, Ziyu Lin, Xiangrong Liu, Investment behavior prediction in heterogeneous information network, Neurocomputing, Volume 217, 2016, Pages 125-132
6. Sun, Y., & Han, J. (2012). Mining Heterogeneous Information Networks: Principles and Methodologies. Synthesis Lectures on Data Mining and Knowledge Discovery, 3(2), 1-159
7. Shi, C., Kong, X., Huang, Y., Yu, P. S., & Wu, B. (2014). HeteSim: A General Framework for Relevance Measure in Heterogeneous Networks. IEEE Transactions on Knowledge and Data Engineering, 26(10), 2479-2492. [6702458].

## Structural Biology Simulation

Proteins are used in almost every biochemical experiment. Ironically, even when billions or trillions of protein molecules are right in front of us, we never get to see their true form because they are so small. I therefore enrolled in a structural biology class, in which I learned four programs (PyMOL, Swiss PDB Viewer, MolMol, and Chimera) for visualizing proteins, their physical properties, and several interaction mechanisms. This helped me understand important structural properties of the protein I had been studying.

Here I demonstrate some of these methods on the protein signal transducer and activator of transcription 3 (STAT3). Three PDB files are used in this project: 3cwg, 1bg1, and 1bf5.

Analysis 1: Protein visualization using cartoon (top-left), dots (top-right), sticks (bottom-left) and spheres (bottom-right). Secondary structures such as the alpha helix and beta sheet are colored differently (PDB ID: 3cwg) (PyMOL).

Analysis 2: The volume of the protein (PDB ID: 1bg1) is calculated as 71.613 nm³, and its surface area is calculated as 263.23 nm². The structure is transformed into a spherical molecular representation prior to calculation (Chimera).

Analysis 3: Total width and height of the protein (PDB ID: 3cwg) (Swiss PDB Viewer).

Analysis 4: Morphing between two different PDB files of the same protein (PDB ID: 3cwg and 1bg1) (Chimera). The blue structure is 3cwg, and the gray-white structure is 1bg1 in the lower figure.

Analysis 5: Electric charge on alpha helix (PDB ID: 3cwg) (PyMOL).

Analysis 6: Mutation of Proline to Histidine at residue 255 (PDB ID: 3cwg) (Swiss PDB Viewer).

Analysis 7: Twisting of the ϕ and ψ angle (respectively left and right figure) at residue 255 (Proline) (PDB ID: 3cwg) (Swiss PDB Viewer).

Analysis 8: Ramachandran plots of the same protein with two different PDB files (PDB ID: 3cwg (left figure) and 1bg1 (right figure)) (MolMol).

Analysis 9: Coulombic electrostatic potential on the protein surface. The surface is colored on a gradient from red (−10 kcal/(mol·e)) to blue (+10 kcal/(mol·e)) to indicate differences in electrostatic potential (PDB ID: 3cwg) (Chimera).

Analysis 10: The hydrogen bond between the two SH2 domains of the STAT3 dimer (PDB ID: 1bg1) (Chimera).

Analysis 11: Morphing between STAT3 (1bg1) and STAT1 (1bf5) (another similar protein of the STAT family) (Chimera). The blue structure is STAT1, and the white structure is STAT3 in the lower figure.
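The ϕ and ψ angles in Analyses 7 and 8 are backbone dihedral angles: ϕ is defined by the atoms C(i−1), N(i), Cα(i), C(i) and ψ by N(i), Cα(i), C(i), N(i+1). A Ramachandran plot is a scatter of these two angles for every residue. The sketch below shows how one such angle is computed from four atom positions. The coordinates are simple hypothetical test points, not atoms taken from 3cwg or 1bg1, and the helper names are my own.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Hypothetical points: the central bond lies along the z axis.
p0, p1, p2 = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
for label, p3 in (("cis", (1.0, 0.0, 1.0)),
                  ("quarter turn", (0.0, 1.0, 1.0)),
                  ("trans", (-1.0, 0.0, 1.0))):
    print(f"{label:12s} {dihedral_deg(p0, p1, p2, p3):7.1f} deg")
```

Applied to a PDB file, ϕ for residue i would use the coordinates of the four backbone atoms listed above, read from the ATOM records.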

## Fermentation Batch Reactor

In everyday life, it is rare that one can directly link an observed outcome to its theoretical grounds; equations and plots can seem far removed from practical applications. I regard this project as an important one because it links observations from a simple experiment to the differential equations of reaction kinetics. This mini-project comes from a homework assignment in reaction engineering, a course I took in college. The experiment is simple enough that anyone can carry it out with easily accessible materials. The main objective is to construct a batch reactor that exhibits yeast fermentation, then quantify the reaction using what we learned in class. (~age 21, 2017)

Two commercially available sugar-sweetened beverages, a glucose solution, and water are used to explore how sugar content affects the fermentation rate of instant yeast. The fermentation reaction of yeast in an anaerobic environment is:

C₆H₁₂O₆ (monosaccharide) → 2 C₂H₅OH (ethanol) + 2 CO₂ (carbon dioxide) + 2 ATP

In this experiment, glass containers are filled with the solutions, then instant yeast is added to each container to produce carbon dioxide. A balloon traps the gas and serves as a volume sensor: its dimensions are measured to calculate the volume of generated CO2. The amount of CO2 is calculated using the ideal gas equation PV = nRT, and the ethanol production rate is calculated from the reaction stoichiometry using the finite difference method.

Figure 1. Snapshots of balloon-sealed containers with added yeast at different times.

Figure 2. Volume of balloon (Vballoon) vs time (min).

Assume an inner air pressure of P = 1 atm and a temperature of T = 310 K (37 °C). From the ideal gas equation, the number of moles of CO2 is related to its volume by n = 3.931×10⁻⁵ V (n in mol, V in cm³), which, according to the reaction, also equals the number of moles of ethanol. The molar concentration of ethanol is calculated by dividing the number of moles by the solution volume. Using a finite difference approximation of the first derivative, the rate of increase of the molar concentration of ethanol (rC2H5OH) is calculated (Fig. 3).

Figure 3. Increase rate of molar concentration of ethanol (rC2H5OH) vs time (min).
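The conversion from balloon volume to ethanol production rate can be reproduced in a few lines. The sketch below recovers the 3.931×10⁻⁵ mol/cm³ factor from the ideal gas equation and applies a forward finite difference, which is one possible choice since the exact difference scheme is not specified above. The balloon volumes and the liquid volume are hypothetical numbers, not the measured data.

```python
R_CM3_ATM = 82.057  # gas constant in cm^3 atm / (mol K)

def moles_of_gas(volume_cm3, pressure_atm=1.0, temp_k=310.0):
    """Moles of ideal gas in the given volume: n = PV / (RT)."""
    return pressure_atm * volume_cm3 / (R_CM3_ATM * temp_k)

def forward_difference(times, values):
    """Rate of change between consecutive samples."""
    return [(values[i + 1] - values[i]) / (times[i + 1] - times[i])
            for i in range(len(values) - 1)]

LIQUID_VOLUME_L = 0.1                      # hypothetical solution volume
times_min = [0, 10, 20, 30]                # hypothetical sampling times
balloon_cm3 = [0.0, 120.0, 200.0, 240.0]   # hypothetical balloon volumes

# One mole of CO2 corresponds to one mole of ethanol in the reaction.
ethanol_mM = [1000.0 * moles_of_gas(v) / LIQUID_VOLUME_L for v in balloon_cm3]
rates_mM_per_min = forward_difference(times_min, ethanol_mM)

print(f"mol of CO2 per cm^3: {moles_of_gas(1.0):.4e}")
for t, r in zip(times_min, rates_mM_per_min):
    print(f"t = {t:2d} min   r = {r:.2f} mM/min")
```

A negative rate appears whenever a later balloon volume is smaller than the previous one, which is how a shrinking balloon would show up as an apparent negative ethanol production rate.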

Except for pure water (the negative control), the three sugar-containing solutions have their maximum formation rate at the beginning (marked by the blue arrow). The ethanol production rate of the glucose solution eventually drops below 0 mM/min; this is presumably caused either by measurement error or by carbon dioxide dissolving back into the liquid, which decreases the balloon volume rather than the amount of ethanol.

The production rate of ethanol in the glucose solution starts at a very high value (8.29 mM/min), followed by fruit tea (4.17 mM/min) and then raspberry juice (3.36 mM/min), even though the sugar concentration of raspberry juice is higher than that of fruit tea. Two factors may explain this: the type of sugar and the pH. The pH of fruit tea is between 5.0 and 6.0, while the pH of raspberry juice is between 2.3 and 2.52. Since the optimal pH for yeast is 4.5 to 5.0, it is speculated that the acidic environment of raspberry juice inhibits yeast activity and reduces rC2H5OH. In addition, the glucose solution contains only glucose, whereas both raspberry juice and fruit tea contain sucrose. Sucrose is a disaccharide that yeast can break down into two monosaccharides, so it can produce twice as much ethanol as the same molar concentration of glucose. This explains why the final balloon volume of fruit tea (408.69 cm³) is greater than that of the glucose solution (361.03 cm³).

As a simple hands-on experiment, this project let me learn the fundamentals through practice and created a connection between theory and reality.