This is a small experiment in visualizing the squares on which kings are most often checkmated. The dataset is from lichess and is not filtered by rating. My guess was that, at amateur level, most checkmates happen against a castled king, and the heat map confirms that.
First off, we need to do some pre-processing of our PGN files. This is not strictly necessary, as you can use the excellent python-chess module to parse them, but it is a lot faster if we turn it into a pattern-matching problem rather than parsing the whole game. We will use the pgn-extract utility for this processing step:
!pgn-extract --dropply -1 -V -C -N -F --notags --nomovenumbers -s --linelength 10000 -M lichess_db_standard_rated_2016-09.pgn > matefen
The idea here is to go to the final position of the game (--dropply -1), suppress unnecessary output (--notags --nomovenumbers -V -C -N -s), output a FEN string (-F), make sure everything is on one line for easier processing (--linelength 10000), and only export games that ended in checkmate (-M).
Depending on your machine and the number of games you are processing, this command could take a while. Once it completes, you should have a file like this:
dendiz@minipc-GN41:~/tmp$ head matefen
Qa7# { "8/q7/8/8/8/8/K1k5/8 w - - 16 69" } 0-1
Qdd1# { "1k6/8/pp6/8/P2p4/8/2q5/3q2K1 w - - 4 52" } 0-1
Rd8# { "3r3K/4q2p/6pk/5p2/8/8/8/8 w - - 15 70" } 0-1
Qxg7# { "rn3rk1/1p3pQp/p2q4/2pPp3/4N3/7P/PP1K4/R5R1 b - - 0 25" } 1-0
Qxc1# { "1kr5/1p4b1/p2p2b1/3Pp1N1/4P3/1P1B1P2/P6Q/1Kq5 w - - 0 33" } 0-1
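If you would rather launch the extraction from a plain Python script than from a notebook cell, something along these lines might work. It is only a sketch: the helper names build_pgn_extract_command and run_extract are made up for illustration, the flags are copied verbatim from the command above, and it assumes pgn-extract is on your PATH.

```python
import subprocess


def build_pgn_extract_command(pgn_path):
    """Return the argument list for the pgn-extract call shown above.

    Hypothetical helper; the flags are taken unchanged from the post.
    """
    return [
        "pgn-extract",
        "--dropply", "-1",              # go to the final position of the game
        "-V", "-C", "-N", "-s",         # suppress unnecessary output
        "--notags", "--nomovenumbers",  # suppress unnecessary output
        "-F",                           # output a FEN string
        "--linelength", "10000",        # keep everything on one line
        "-M",                           # only games that ended in checkmate
        pgn_path,
    ]


def run_extract(pgn_path, out_path="matefen"):
    """Run pgn-extract and redirect its standard output to out_path."""
    with open(out_path, "w") as out:
        subprocess.run(build_pgn_extract_command(pgn_path), stdout=out, check=True)
```

Calling run_extract("lichess_db_standard_rated_2016-09.pgn") should then produce the same matefen file as the notebook command.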
The next step is to get rid of the extra bits.
cat matefen | cut -d'{' -f2| cut -d'}' -f1 | tr -d '"' |cut -d' ' -f2,3 | grep "\S" > slim-mate.fen
What happens here is that we cut between the { and } to get the FEN, remove the " characters, then split on spaces and keep the second and third fields, which are the board and the side to move (the first field is empty because the text starts with a space). Finally, grep gets rid of the empty lines. (You could also chain all of these commands together if you wanted.)
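The same slimming step can also be sketched in Python with a regular expression, in case cut and tr are not available. This is an illustration under assumptions, not the method used in this post: slim_line and slim_file are hypothetical names, and the pattern assumes every useful line has the { "FEN" } shape shown in the head output above.

```python
import re

# Matches the opening of the { "..." } comment and captures the first two
# FEN fields: the board layout and the side to move.
FEN_COMMENT = re.compile(r'\{\s*"(\S+)\s+([wb])\s')


def slim_line(line):
    """Return 'board side' for one pgn-extract output line, or None."""
    match = FEN_COMMENT.search(line)
    if match is None:
        return None  # empty line or anything without a FEN comment
    return f"{match.group(1)} {match.group(2)}"


def slim_file(in_path="matefen", out_path="slim-mate.fen"):
    """Write the slimmed lines of in_path to out_path, skipping the rest."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            slim = slim_line(line)
            if slim is not None:
                dst.write(slim + "\n")
```

Lines that do not contain a FEN comment are dropped, which plays the role of the grep at the end of the shell pipeline.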
After this step the file we have is
dendiz@minipc-GN41:~/app/jupyterhub/notebooks/Blog/files$ head slim-mate.fen
8/q7/8/8/8/8/K1k5/8 w
1k6/8/pp6/8/P2p4/8/2q5/3q2K1 w
3r3K/4q2p/6pk/5p2/8/8/8/8 w
rn3rk1/1p3pQp/p2q4/2pPp3/4N3/7P/PP1K4/R5R1 b
This contains all the information we need (the side being checkmated and the state of the board) to start building a histogram of the squares on which the kings were checkmated.
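Before the full script, the per-line work can be illustrated in isolation. The sketch below takes one line of slim-mate.fen and returns the square of the mated king in algebraic notation. The function name mated_king_square is made up for this example; the logic assumes, as the post does, that the side to move is the side that has been checkmated.

```python
FILES = "abcdefgh"


def mated_king_square(slim_line):
    """Return the square (e.g. 'g1') of the checkmated king, or None."""
    board, side = slim_line.split()
    # The side to move in the final position is the one that got mated.
    king = "K" if side == "w" else "k"
    # FEN lists the ranks from 8 down to 1, each rank from file a to file h.
    for rank_idx, row in enumerate(board.split("/")):
        col = 0
        for ch in row:
            if ch.isdigit():
                col += int(ch)      # a digit means that many empty squares
            elif ch == king:
                return FILES[col] + str(8 - rank_idx)
            else:
                col += 1            # any other piece occupies one square
    return None
```

For the first sample line, mated_king_square("8/q7/8/8/8/8/K1k5/8 w") gives "a2", which matches the white king sitting in the corner of the queen-side in that position.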
import math

import numpy as np
import matplotlib.pyplot as plt

cols = list("abcdefgh")
crows = list("87654321")
histo = [0] * 64

with open("slim-mate.fen", "r") as infile:
    lines = infile.read().splitlines()

for line in lines[:1000000]:
    ## line is
    ## 1k6/8/pp6/8/P2p4/8/2q5/3q2K1 w
    ## the side to move is the side that has been checkmated
    king = "K" if line[-1] == "w" else "k"
    fen = line[:-2]
    rows = fen.split("/")
    for idx, row in enumerate(rows):
        if king in row:
            col = 0
            for c in row:
                if c.isnumeric():
                    ## a digit is a run of empty squares
                    col += int(c)
                elif c == king:
                    hist_idx = 8 * idx + col
                    histo[hist_idx] += 1
                else:
                    col += 1

## normalize the values for the heat map
maxl = max(histo)
histo = [x / maxl for x in histo]
histo = [math.floor(100 * x) / 100 for x in histo]

## create 8x8 matrix from the flat list
## FEN lists rank 8 first and imshow draws row 0 at the top, so the rows
## already line up with the 8..1 tick labels and must not be reversed
b = [histo[n:n + 8] for n in range(0, len(histo), 8)]
b = np.array(b)

fig, ax = plt.subplots()
ax.set_xticks(np.arange(8))
ax.set_yticks(np.arange(8))
ax.set_xticklabels(cols)
ax.set_yticklabels(crows)
im = ax.imshow(b)

## write the normalized value into each square
for i in range(8):
    for j in range(8):
        text = ax.text(j, i, b[i, j], ha="center", va="center", color="w")

fig.tight_layout()
plt.show()
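To sanity-check the normalization and the board orientation without opening a plot, the 64 counts can also be printed as a labelled text grid. The following is a rough sketch with made-up helper names (normalise, as_text_grid). It assumes the same layout as the script above: index 0 is a8 and index 63 is h1. It multiplies by 100 before dividing by the peak, which differs slightly from the script and sidesteps some floating-point rounding.

```python
import math


def normalise(counts):
    """Scale counts relative to the busiest square, floored to two decimals."""
    peak = max(counts)
    if peak == 0:
        return [0.0] * len(counts)  # no games counted, avoid dividing by zero
    return [math.floor(100 * c / peak) / 100 for c in counts]


def as_text_grid(values):
    """Render 64 values as eight labelled ranks, rank 8 first, plus a file row."""
    lines = []
    for rank_idx in range(8):
        row = values[8 * rank_idx: 8 * rank_idx + 8]
        lines.append(f"{8 - rank_idx} " + " ".join(f"{v:4.2f}" for v in row))
    lines.append("  " + " ".join(f"{name:>4}" for name in "abcdefgh"))
    return "\n".join(lines)
```

With the histo list from the script, print(as_text_grid(normalise(histo))) should show the largest values in the same squares as the heat map.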
First published on 2021-02-20
Generated on Oct 9, 2024, 4:11 AM