Nonogram AI

Nonograms (also known as Picross, Griddlers, Pic-a-Pix, and various other names) are picture logic puzzles, in which the cells in the grid must be colored or left blank according to the numbers on the side of the grid to display hidden pictures.

Figure 1. The initial and complete state of a 25×25 grid Nonogram.
(Source: https://www.nonograms.org/nonograms/i/32344 )

For a classical type of game, the numbers are a form of discrete tomography that measures how many unbroken lines of filled-in squares there are in any given row or column. For example, a clue of “1 2 3” would mean there are sets of one, two, and three filled squares, in that order, with at least one blank square between successive sets.

Solving Nonograms can be very time-consuming, and can be tremendously brain-twisting as the grid size increases, thus I created an AI program for automatically solving these puzzles.

The program utilizes the depth-first search algorithm that runs recursively from the top to bottom row. For the jth column of the ith row, the black or blank spaces must satisfy the tomographic rules formed by the numbers of the current row and column. This is a relatively simple approach for calculating the answer. However, because the algorithm does not solve the game using a more intuitive method for solving interrelated constraints, it will use a lot of time for big grids (over 40×40).

For the input format of this program, two numbers indicating the number of rows and columns are given first (n and m). Afterwards, there are n rows of data, with each row k starting with a number nk indicating how many numbers are given in that row, and followed by nk numbers representing the discrete tomography of black spaces of that very row; then there are m rows of data, and also starting with a number mk for the kth row, and followed by mk numbers representing the discrete tomography of black spaces for the kth column.

Here’s an example input for the Nonogram in Fig. 1:

25 25
1 8
2 7 3
1 16
2 11 4
2 13 2
2 14 2
1 18
2 8 4
2 6 4
2 5 5
3 4 2 2
3 4 3 1
3 3 2 1
2 3 2
2 3 2
3 2 1 4
3 2 1 4
3 2 1 4
3 3 1 4
2 5 4
1 11
1 10
1 5
1 5
1 6
1 1
1 2
1 2
1 4
1 11
1 13
1 15
2 9 3
2 8 3
3 7 5 2
3 7 3 4
3 6 2 3
3 8 3 2
2 13 2
2 10 2
3 5 5 3
4 4 1 4 6
4 1 2 2 8
3 3 1 10
4 3 1 4 4
4 2 1 2 3
3 2 1 2
2 2 1
2 2 1
1 1

A demonstration for using the AI program for solving a 39×50 Nonogram in about 22 seconds is shown below. The result displays an owl sitting on a branch (Fig. 2).

Figure 2. The complete state of a 39×50 Nonogram displaying an owl on a branch.

[Download program for the Nonogram solver AI]

[Source code for the Nonogram solver AI]

Othello AI

Othello (a variant of the traditional game Reversi) is a two-player strategy board game played on an 8×8 square field, where each player takes turns placing black or white pieces and capturing the other player’s pieces.

Figure 1. Othello board game.

The Othello AI program is the second board game AI that I have written in junior high school (the first is the Tic-Tac-Toe AI). Just like the programming strategy for the tic-tac-toe AI, I used the logic thinking experiences that I had learned while playing this game, and hard-coded some summarized strategies into this program.

Compared with the tic-tac-toe AI, which has a game-tree complexity of 9! = 362880, Othello is much more complex, yielding a stunning complexity of approximately 1058. This makes the game still mathematically unsolved up to this very day. Therefore, instead of calculating the definite winning strategy, this Othello AI rather tries to find relative advantageous points (e.g. the corners), and moves that will temporarily maximize the number of pieces which it occupies by the coded algorithm, making it a beginner ~ intermediate level AI.

Here is a demonstration of using the Othello AI program:

[Download Othello AI program]

[Source code for the Othello AI program (Visual Basic 6.0 form file)]

Sudoku AI

Sudoku is a commonly known logic puzzle game, and had been one of my favorite puzzle games.

Having the experience for creating the simple Tic-Tac-Toe AI, I started to take on this challenge for creating an AI that can solve Sudoku.

Figure 1. The initial and complete board state of an expert-level Sudoku game.

I had written this AI program when I was in junior high school using Visual Basic 6.0. Having no prior knowledge for algorithms and data structures (which I learned in senior high school), I came up with a rough version of the backtracking algorithm myself, and implemented it on the AI. Here, the backtracking algorithm is a kind of depth-first search (DFS), and is guaranteed for finding a valid solution. Constraint programming is also integrated with the backtracking algorithm, since it is a very intuitive way for solving the puzzle.

Here is a clip demonstrating the Sudoku Solver program solving a hard-level Sudoku in around 5 seconds:

Here is another one that solved an expert-level Sudoku in around 44 seconds:

[Download program for the Sudoku Solver]

[Source code for the Sudoku Solver (Visual Basic 6.0 form file)]

Structural Biology Simulation

The usage of proteins is almost inevitable in most biochemical experiments. The ironic thing is, even if several billion or trillion proteins are present right in front of us, we never really get to see their true form due to their microscopic sizes. Thus, I enrolled in a class named structural biology, which I learned four programs: PyMOL, Swiss PDB Viewer, MolMol, and Chimera, for visualizing proteins, their physical properties, and several interaction mechanisms. This helped me understand important structural properties about the protein I had been studying.

Here I demonstrate some simulation methods implemented on the protein: signal transducer and activator of transcription 3 (STAT3). Three PDB files are used in this project: 3cwg, 1bg1, and 1bf5.

Analysis 1: Protein visualization using cartoon (top-left), dots (top-right), sticks (bottom-left) and spheres (bottom-right). Secondary structures such as the alpha helix and beta sheet are colored differently (PDB ID: 3cwg) (PyMOL).

Analysis 2: The volume of the protein (PDB ID: 1bg1) is calculated as 71.613nm3, and its surface area is calculated as 263.23nm2. The structure is transformed into a spherical molecular representation prior to calculation (Chimera).

Analysis 3: Total width and height of the protein (PDB ID: 3cwg) (Swiss PDB Viewer).

Analysis 4: Morphing between two different PDB files of the same protein (PDB ID: 3cwg and 1bg1) (Chimera). The blue structure is 3cwg, and the gray-white structure is 1bg1 in the lower figure.

Analysis 5: Electric charge on alpha helix (PDB ID: 3cwg) (PyMol).

Analysis 6: Mutation of Proline to Histidine at residue 255 (PDB ID: 3cwg) (Swiss PDB Viewer).

Analysis 7: Twisting of the ϕ and ψ angle (respectively left and right figure) at residue 255 (Proline) (PDB ID: 3cwg) (Swiss PDB Viewer).

Analysis 8: Ramachandran plots of the same protein with two different PDB files (PDB ID: 3cwg (left figure) and 1bg1 (right figure)) (MolMol).

Analysis 9: Coulomb force on protein surface. The surface is colored from red (-10kcal/mol×e) to blue (10kcal/mol×e) gradient in order to indicate differences in Coulombic forces (PDB ID: 3cwg) (Chimera).

Analysis 10: The hydrogen bond between the two SH2 domains of the STAT3 dimer (PDB ID: 1bg1) (Chimera).

Analysis 11: Morphing between STAT3 (1bg1) and STAT1 (1bf5) (another similar protein of the STAT family) (Chimera). The blue structure is STAT1, and the white structure is STAT3 in the lower figure.

Fermentation Batch Reactor

For most of what we experience in everyday life, it is rare that one can directly link the obvious outcomes with their underlying theoretical grounds. Equations and plots seem such a long distance toward their practical applications. I regard this project as an important one which links observations of a simple experiment to the complex differential equations in reaction mechanics. This mini-project comes from a homework in reaction engineering, a course I had enrolled in during college. The experiment is simple that any person can carry out using easily accessible materials. The main objective is to construct a batch reactor that can exhibit fermentation with yeast, then quantify the reactions using what we have learned on class. (~age 21, 2017)

Two commercially available sugar-sweetened beverages, glucose solution, and water are used to explored how the sugar content in them affects the fermentation rate of rapid yeast. The anaerobic fermentation of yeast in anaerobic environment is:

C6H12O6 (monosaccharide) → 2C2H5OH (ethanol) + 2CO2 (carbon dioxide) + 2ATP

In this experiment, glass containers are filled with the solutions, then instant yeast is added the each container for production of carbon dioxide. A balloon is used for trapping the gases and is used as a volume sensor, where its dimensions are measured for calculating the volume of generated CO2. The molar concentration of CO2 is calculated using the ideal gas equation PV = nRT, and the ethanol production rate is calculated by relating with the proposed reaction and using finite different method.

Figure 1. Snapshots of balloon-sealed containers with added yeast at different times.

Figure 2. Volume of balloon (Vballoon) vs time (min).

Assume an inner air pressure of P = 1atm, a temperature of T = 310K (37°C). From the ideal gas equation, the relationship between the number of moles of CO2 and its volume is n = 3.931×10-5V, which according to the reaction, also equals the number of moles of ethanol. The molar concentration of ethanol is calculated by dividing the number of moles by its volume. And by using finite difference method of the first derivative, the rate of increase for molar concentration of ethanol (rC2H5OH) is calculated (Fig. 3).

Figure 3. Increase rate of molar concentration of ethanol (rC2H5OH) vs time (min).

It can be seen that in addition to pure water (Negative), the other three sugar-containing solutions have a maximum formation rate at the beginning (marked by the blue arrow). Wherein the ethanol production rate of glucose solution is eventually lower than 0(mM/min), it is presumed either this is caused by measurement errors or that carbon dioxide is dissolved back into the liquid, causing a decrease in volume, not a decrease in the amount of ethanol.

Here the production rate of ethanol in glucose solution started at a very high value (8.29mM/min), followed by fruit tea (4.17mM/min), and then raspberry juice (3.36mM/min). However, the sugar concentration of raspberry juice is higher than that of fruit tea. There are two factors that may be affected: the type of sugar and the pH value. Among them, the pH of fruit tea is between 5.0 and 6.0 and the pH of raspberry juice is between 2.3 and 2.52. However, the optimal living environment pH of yeast is 4.5 to 5.0, so it is speculated that the acidic environment of raspberry juice inhibits the activity of yeast and reduces rC2H5OH. In addition, only glucose exists in the glucose solution, but there is sucrose in both raspberry juice and fruit tea. Sucrose can be broken down by the yeast and producing ethanol twice as much as the same concentration of glucose. This explains why the final balloon volume (408.69cm3) of fruit tea is greater than the final balloon volume of the glucose solution (361.03cm3).

As being a simple hands-on experiment, this project successfully delivered the knowledge and allowed me to learn the fundamentals through practice, by which creating a connection between reality and theory.

Gomoku AI

Although entering a department different from Computer Science (CS), I still had a majority of acquainted people in NTU-CS. One day, a friend of mine who was studying in CS challenged me of making an AI of some board game. The two of us would make AIs of that game and try to win the other one using the AI. Thinking it might be some challenging but interesting subject, I started to make this Gomoku AI.

Figure 1. Game playing demonstration of the Gomoku AI program.

Being a board game with so many possible states (more than the Othello AI I had made when I was in junior high school, which only has a 8×8 grid) , it is considered overly time wasting for the AI to traverse all the possible combinations, even just within 10 moves! Game tree graphs, graph traversal algorithms (e.g. DFS, BFS) and optimal searching algorithms (e.g. Alpha–beta pruning, A*) are usually implemented for decreasing computational time and achieving the best strategy. In this project, I only used the depth-first search (DFS) algorithm, along with some hard coded optimization decisions to create this program. Although being able to place a few initial pieces successfully, the program suffers from a large amount of computational time being spent. Most of the time after a few moves, the program will just run for a few hours (or days) before taking the next move, which is obviously very a serious problem for the AI.

Nevertheless, the environment and rules of the Gomoku game engine was successfully constructed, which allowed two players to play against each other. Nowadays, these board game AI are usually designed using deep neural networks, such as AlphaGomoku, an Alpha-Go-based Gomoku AI. Constructing these AIs would be a possible future work for optimizing this game agent.

Tic-Tac-Toe AI

Artificial intelligence (AI) has long inspired me, and had spurred me on to come up with interesting programs or projects that I had not imagined before. Since great oaks from little acorns grow, this tic-tac-toe AI project is the first and one of the most important programs I have made that can be said to possess artificial intelligence. (~age 15, 2011)

Either with the player playing first or the computer, the AI will never lose! Being a very simple game that one could easily master, tic-tac-toe can yet be quite complex in a way in that there is actually a total of 255168 possible outcomes! Fig. 1 illustrates all the possible board states until the 5th move using an optimal strategy with the 1st move at the center (States that can arise from mirroring or rotating the shown states are excluded). The green circled states are those which the circles have won, and the yellow ones are those which the circles will eventually win while continuing to implement the optimal strategy. We can see that it actually gets quite complicated after the 3rd move.

Figure 1. Possible states of Tic-Tac-Toe until the 5th move starting from the center.

Since back then I knew nothing about algorithms (which I had started to learn at senior high school), I hard coded almost all the conditions using very basic syntax: If…Else…End If and For…Loop. The AI will sometimes move randomly, but it will always follow an optimized strategy. It took me ~650 lines of code and several days of hard work to complete.

This project demonstrated the possibility of programs to achieve human level performance for playing simple games. Although objectively not considered an astonishing one, I was surely amazed and unprecedentedly inspired by the potential of programming algorithms for AI; which further on motivated me to create AI programs of increasing difficulties (refer to Gomoku AI, Sudoku AI, Othello AI, Nonogram AI, Guess the Number AI, Minesweeper AI).