<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Untitled Publication]]></title><description><![CDATA[Untitled Publication]]></description><link>https://blog.gkomninos.com</link><generator>RSS for Node</generator><lastBuildDate>Tue, 12 May 2026 22:10:25 GMT</lastBuildDate><atom:link href="https://blog.gkomninos.com/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[How to Create a SOCKS5 Proxy with VPN Tunneling Using Docker]]></title><description><![CDATA[This guide shows you how to set up a SOCKS5 proxy that routes all traffic through a VPN connection using Docker Compose. This setup provides an extra layer of privacy for your internet traffic.
Prerequisites

Docker and Docker Compose installed

A VP...]]></description><link>https://blog.gkomninos.com/how-to-create-a-socks5-proxy-with-vpn-tunneling-using-docker</link><guid isPermaLink="true">https://blog.gkomninos.com/how-to-create-a-socks5-proxy-with-vpn-tunneling-using-docker</guid><category><![CDATA[networking]]></category><category><![CDATA[Scraping]]></category><category><![CDATA[Docker compose]]></category><category><![CDATA[vpn]]></category><category><![CDATA[proxy]]></category><dc:creator><![CDATA[Georgios Komninos]]></dc:creator><pubDate>Sat, 10 May 2025 05:29:46 GMT</pubDate><content:encoded><![CDATA[<p>This guide shows you how to set up a SOCKS5 proxy that routes all traffic through a VPN connection using Docker Compose. This setup provides an extra layer of privacy for your internet traffic.</p>
<h2 id="heading-prerequisites">Prerequisites</h2>
<ul>
<li><p>Docker and Docker Compose installed</p>
</li>
<li><p>A VPN subscription (this example uses NordVPN but works with other providers)</p>
</li>
</ul>
<h2 id="heading-step-1-create-project-structure">Step 1: Create Project Structure</h2>
<p>First, create a directory for your project and a subdirectory for your VPN configuration:</p>
<pre><code class="lang-bash">mkdir -p proxy-vpn/vpn1
<span class="hljs-built_in">cd</span> proxy-vpn
</code></pre>
<h2 id="heading-step-2-create-the-docker-compose-file">Step 2: Create the Docker Compose File</h2>
<p>Create a file named <code>docker-compose.yml</code> in the project directory and paste the following configuration:</p>
<pre><code class="lang-dockerfile">services:
  vpn1:
    image: dperson/openvpn-client:latest
    container_name: vpn-client1
    cap_add:
      - NET_ADMIN
    devices:
      - /dev/net/tun
    restart: unless-stopped
    volumes:
      - ./vpn1:/vpn
    environment:
      - PUID=<span class="hljs-number">1000</span>
      - PGID=<span class="hljs-number">1000</span>
    command: [<span class="hljs-string">"-f"</span>, <span class="hljs-string">'""'</span>, <span class="hljs-string">"-a"</span>, <span class="hljs-string">"${VPN_USERNAME};${VPN_PASSWORD}"</span>]
    networks:
      - vpn_network1
    ports:
      - <span class="hljs-string">"1080:1080"</span>
    dns:
      - <span class="hljs-number">8.8</span>.<span class="hljs-number">8.8</span>
      - <span class="hljs-number">8.8</span>.<span class="hljs-number">4.4</span>
    sysctls:
      - net.ipv6.conf.all.disable_ipv6=<span class="hljs-number">1</span>

  socks-proxy1:
    image: serjs/go-socks5-proxy:latest
    container_name: socks5-proxy1
    restart: unless-stopped
    environment:
      - PROXY_USER=${SOCKS_USER}
      - PROXY_PASSWORD=${SOCKS_PASSWORD}
    network_mode: <span class="hljs-string">"service:vpn1"</span>
    depends_on:
      - vpn1
    sysctls:
      - net.ipv6.conf.all.disable_ipv6=<span class="hljs-number">1</span>

networks:
  vpn_network1:
    driver: bridge
</code></pre>
<h2 id="heading-step-3-set-up-vpn-configuration">Step 3: Set Up VPN Configuration</h2>
<ol>
<li><p>Download your VPN configuration file from your provider (for NordVPN, visit <a target="_blank" href="https://nordvpn.com/blog/nordvpn-config-files/">https://nordvpn.com/blog/nordvpn-config-files/</a>)</p>
</li>
<li><p>Save the configuration file as <code>vpn.conf</code> in the <code>vpn1</code> directory</p>
</li>
</ol>
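<p>Here's a sample of what a NordVPN configuration looks like. Replace <code>&lt;SERVER-IP&gt;</code> with the server of your choice (pick one from the configs you downloaded), and make sure the <code>verify-x509-name</code> line matches that server's hostname:</p>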
<pre><code class="lang-plaintext">client
dev tun
proto udp
remote &lt;SERVER-IP&gt; 1194
resolv-retry infinite
remote-random
nobind
tun-mtu 1500
tun-mtu-extra 32
mssfix 1450
persist-key
persist-tun
ping 15
ping-restart 0
ping-timer-rem
reneg-sec 0
comp-lzo no
verify-x509-name CN=es137.nordvpn.com

remote-cert-tls server

auth-user-pass
verb 3
pull
fast-io
cipher AES-256-CBC
auth SHA512
&lt;ca&gt;
-----BEGIN CERTIFICATE-----
MIIFCjCCAvKgAwIBAgIBATANBgkqhkiG9w0BAQ0FADA5MQswCQYDVQQGEwJQQTEQ
MA4GA1UEChMHTm9yZFZQTjEYMBYGA1UEAxMPTm9yZFZQTiBSb290IENBMB4XDTE2
MDEwMTAwMDAwMFoXDTM1MTIzMTIzNTk1OVowOTELMAkGA1UEBhMCUEExEDAOBgNV
BAoTB05vcmRWUE4xGDAWBgNVBAMTD05vcmRWUE4gUm9vdCBDQTCCAiIwDQYJKoZI
hvcNAQEBBQADggIPADCCAgoCggIBAMkr/BYhyo0F2upsIMXwC6QvkZps3NN2/eQF
kfQIS1gql0aejsKsEnmY0Kaon8uZCTXPsRH1gQNgg5D2gixdd1mJUvV3dE3y9FJr
XMoDkXdCGBodvKJyU6lcfEVF6/UxHcbBguZK9UtRHS9eJYm3rpL/5huQMCppX7kU
eQ8dpCwd3iKITqwd1ZudDqsWaU0vqzC2H55IyaZ/5/TnCk31Q1UP6BksbbuRcwOV
skEDsm6YoWDnn/IIzGOYnFJRzQH5jTz3j1QBvRIuQuBuvUkfhx1FEwhwZigrcxXu
MP+QgM54kezgziJUaZcOM2zF3lvrwMvXDMfNeIoJABv9ljw969xQ8czQCU5lMVmA
37ltv5Ec9U5hZuwk/9QO1Z+d/r6Jx0mlurS8gnCAKJgwa3kyZw6e4FZ8mYL4vpRR
hPdvRTWCMJkeB4yBHyhxUmTRgJHm6YR3D6hcFAc9cQcTEl/I60tMdz33G6m0O42s
Qt/+AR3YCY/RusWVBJB/qNS94EtNtj8iaebCQW1jHAhvGmFILVR9lzD0EzWKHkvy
WEjmUVRgCDd6Ne3eFRNS73gdv/C3l5boYySeu4exkEYVxVRn8DhCxs0MnkMHWFK6
MyzXCCn+JnWFDYPfDKHvpff/kLDobtPBf+Lbch5wQy9quY27xaj0XwLyjOltpiST
LWae/Q4vAgMBAAGjHTAbMAwGA1UdEwQFMAMBAf8wCwYDVR0PBAQDAgEGMA0GCSqG
SIb3DQEBDQUAA4ICAQC9fUL2sZPxIN2mD32VeNySTgZlCEdVmlq471o/bDMP4B8g
nQesFRtXY2ZCjs50Jm73B2LViL9qlREmI6vE5IC8IsRBJSV4ce1WYxyXro5rmVg/
k6a10rlsbK/eg//GHoJxDdXDOokLUSnxt7gk3QKpX6eCdh67p0PuWm/7WUJQxH2S
DxsT9vB/iZriTIEe/ILoOQF0Aqp7AgNCcLcLAmbxXQkXYCCSB35Vp06u+eTWjG0/
pyS5V14stGtw+fA0DJp5ZJV4eqJ5LqxMlYvEZ/qKTEdoCeaXv2QEmN6dVqjDoTAo
k0t5u4YRXzEVCfXAC3ocplNdtCA72wjFJcSbfif4BSC8bDACTXtnPC7nD0VndZLp
+RiNLeiENhk0oTC+UVdSc+n2nJOzkCK0vYu0Ads4JGIB7g8IB3z2t9ICmsWrgnhd
NdcOe15BincrGA8avQ1cWXsfIKEjbrnEuEk9b5jel6NfHtPKoHc9mDpRdNPISeVa
wDBM1mJChneHt59Nh8Gah74+TM1jBsw4fhJPvoc7Atcg740JErb904mZfkIEmojC
VPhBHVQ9LHBAdM8qFI2kRK0IynOmAZhexlP/aT/kpEsEPyaZQlnBn3An1CRz8h0S
PApL8PytggYKeQmRhl499+6jLxcZ2IegLfqq41dzIjwHwTMplg+1pKIOVojpWA==
-----END CERTIFICATE-----
&lt;/ca&gt;
key-direction 1
&lt;tls-auth&gt;
#
# 2048 bit OpenVPN static key
#
-----BEGIN OpenVPN Static key V1-----
e685bdaf659a25a200e2b9e39e51ff03
0fc72cf1ce07232bd8b2be5e6c670143
f51e937e670eee09d4f2ea5a6e4e6996
5db852c275351b86fc4ca892d78ae002
d6f70d029bd79c4d1c26cf14e9588033
cf639f8a74809f29f72b9d58f9b8f5fe
fc7938eade40e9fed6cb92184abb2cc1
0eb1a296df243b251df0643d53724cdb
5a92a1d6cb817804c4a9319b57d53be5
80815bcfcb2df55018cc83fc43bc7ff8
2d51f9b88364776ee9d12fc85cc7ea5b
9741c4f598c485316db066d52db4540e
212e1518a9bd4828219e24b20d88f598
a196c9de96012090e333519ae18d3509
9427e7b372d348d352dc4c85e18cd4b9
3f8a56ddb2e64eb67adfc9b337157ff4
-----END OpenVPN Static key V1-----
&lt;/tls-auth&gt;
</code></pre>
<h2 id="heading-step-4-create-environment-variables-file">Step 4: Create Environment Variables File</h2>
<p>Create a <code>.env</code> file in the project directory with your credentials:</p>
<pre><code class="lang-plaintext">VPN_USERNAME=username
VPN_PASSWORD=password
#SOCKS_USER=proxyuser # leave commented for no credentials in the socks proxy
#SOCKS_PASSWORD=proxypass # leave commented for no credentials in the socks proxy
</code></pre>
<h2 id="heading-step-5-start-the-services">Step 5: Start the Services</h2>
<pre><code class="lang-bash">docker-compose up -d
</code></pre>
<h2 id="heading-step-6-test-the-connection">Step 6: Test the Connection</h2>
<pre><code class="lang-bash">curl --socks5 localhost:1080 ifconfig.me
</code></pre>
<p>This will display your VPN's IP address rather than your actual IP.</p>
<h2 id="heading-using-your-socks5-proxy">Using Your SOCKS5 Proxy</h2>
<p>You can now configure applications to use your SOCKS5 proxy with these details:</p>
<ul>
<li><p>Proxy host: <code>localhost</code> (or your server's IP)</p>
</li>
<li><p>Port: 1080</p>
</li>
<li><p>Username: the value of <code>SOCKS_USER</code> in the <code>.env</code> file (leave empty if you kept it commented out)</p>
</li>
<li><p>Password: the value of <code>SOCKS_PASSWORD</code> in the <code>.env</code> file (leave empty if you kept it commented out)</p>
</li>
</ul>
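<p>As a rough illustration of the settings above, a Go program can send its HTTP requests through the proxy by pointing <code>http.Transport</code> at a <code>socks5://</code> URL. This is a minimal sketch, not part of the setup itself: <code>newProxyClient</code> is a hypothetical helper, and the <code>https://ifconfig.me/ip</code> endpoint is assumed to return the caller's IP as plain text.</p>
<pre><code class="lang-go">package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"strings"
	"time"
)

// newProxyClient (hypothetical helper) returns an HTTP client that sends
// every request through the SOCKS5 proxy listening on host.
// user and pass may be empty when the proxy runs without credentials.
func newProxyClient(host, user, pass string) (*http.Client, error) {
	proxyURL, err := url.Parse("socks5://" + host)
	if err != nil {
		return nil, err
	}
	if user != "" {
		proxyURL.User = url.UserPassword(user, pass)
	}

	transport := new(http.Transport)
	transport.Proxy = http.ProxyURL(proxyURL)

	client := new(http.Client)
	client.Transport = transport
	client.Timeout = 30 * time.Second
	return client, nil
}

func main() {
	client, err := newProxyClient("localhost:1080", "", "")
	if err != nil {
		fmt.Println("bad proxy address:", err)
		return
	}

	// Assumption: this endpoint answers with the caller's public IP as text.
	resp, err := client.Get("https://ifconfig.me/ip")
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Println("could not read response:", err)
		return
	}
	fmt.Println("public IP seen through the proxy:", strings.TrimSpace(string(body)))
}
</code></pre>
<p>If you enabled credentials in <code>.env</code>, pass them as the second and third arguments of <code>newProxyClient</code>.</p>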
]]></content:encoded></item><item><title><![CDATA[How to generate PDF in Go]]></title><description><![CDATA[In this article I am going to show you one way of generating PDF documents using Go. Generating a PDF document is a common need in many applications.
We are building an invoice management application and one of the requirements is the to generate a P...]]></description><link>https://blog.gkomninos.com/how-to-generate-pdf-in-go</link><guid isPermaLink="true">https://blog.gkomninos.com/how-to-generate-pdf-in-go</guid><category><![CDATA[Programming Blogs]]></category><category><![CDATA[golang]]></category><category><![CDATA[Go Language]]></category><category><![CDATA[Web Development]]></category><category><![CDATA[coding]]></category><dc:creator><![CDATA[Georgios Komninos]]></dc:creator><pubDate>Fri, 21 Jun 2024 05:00:40 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1718859068100/d1f1c1a1-1672-43d0-a1af-31584c48777f.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In this article I am going to show you one way of generating PDF documents using Go. Generating a PDF document is a common need in many applications.</p>
<p>We are building an invoice management application and one of the requirements is to generate a PDF.</p>
<p>This blog post is part of the <a target="_blank" href="https://blog.gkomninos.com/series/webapp-using-golang"><strong>Building a Web App with Golang</strong></a> series.</p>
<h2 id="heading-from-html-to-pdf">From HTML to PDF</h2>
<p>To create a PDF document, we will first create an HTML document and then use the <a target="_blank" href="https://doc.courtbouillon.org/weasyprint/stable/">weasyprint</a> tool to generate the PDF from the HTML.</p>
<p>There are some Go libraries out there, but I found that weasyprint works very well and I haven't run into any issues yet.</p>
<p>Let's first install weasyprint on our machine and see how it works.</p>
<p>Please see the detailed instructions in weasyprint's <a target="_blank" href="https://doc.courtbouillon.org/weasyprint/stable/first_steps.html">documentation</a>.</p>
<p>Here is how you can install it with Python's pip:</p>
<pre><code class="lang-bash">pip install weasyprint
</code></pre>
<p>In the documentation there are instructions for many operating systems. Please check there if you run into issues with the above.</p>
<p>Now let's try a demo to verify that it works.</p>
<h3 id="heading-writing-our-pdf-generator">Writing our pdf generator</h3>
<p>First let's create a new git branch</p>
<pre><code class="lang-bash">git checkout -b invoice-generation
</code></pre>
<p>Then create a folder named <code>templates</code>.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718860309533/4aaa155c-ef70-4fd4-b87a-7d32c89db745.png" alt class="image--center mx-auto" /></p>
<p>Inside it, create an HTML file named <code>invoice.pdf.html</code>:</p>
<pre><code class="lang-xml"><span class="hljs-meta">&lt;!DOCTYPE <span class="hljs-meta-keyword">html</span>&gt;</span>
<span class="hljs-tag">&lt;<span class="hljs-name">html</span> <span class="hljs-attr">lang</span>=<span class="hljs-string">"en"</span>&gt;</span>
<span class="hljs-tag">&lt;<span class="hljs-name">head</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">meta</span> <span class="hljs-attr">charset</span>=<span class="hljs-string">"UTF-8"</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">meta</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"viewport"</span> <span class="hljs-attr">content</span>=<span class="hljs-string">"width=device-width, initial-scale=1.0"</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">title</span>&gt;</span>Invoice 1001/24<span class="hljs-tag">&lt;/<span class="hljs-name">title</span>&gt;</span>
    <span class="hljs-tag">&lt;<span class="hljs-name">style</span>&gt;</span><span class="css">
        <span class="hljs-selector-tag">body</span> {
            <span class="hljs-attribute">font-family</span>: Arial, sans-serif;
            <span class="hljs-attribute">margin</span>: <span class="hljs-number">20px</span>;
            <span class="hljs-attribute">display</span>: flex;
            <span class="hljs-attribute">justify-content</span>: center;
        }
        <span class="hljs-selector-class">.container</span> {
            <span class="hljs-attribute">width</span>: <span class="hljs-number">80%</span>;
            <span class="hljs-attribute">max-width</span>: <span class="hljs-number">800px</span>;
        }
        <span class="hljs-selector-class">.header</span> {
            <span class="hljs-attribute">display</span>: flex;
            <span class="hljs-attribute">justify-content</span>: space-between;
            <span class="hljs-attribute">margin-bottom</span>: <span class="hljs-number">20px</span>;
        }
        <span class="hljs-selector-class">.seller</span> {
            <span class="hljs-attribute">text-align</span>: right;
        }
        <span class="hljs-selector-class">.invoice-info</span> {
            <span class="hljs-attribute">text-align</span>: right;
            <span class="hljs-attribute">margin-bottom</span>: <span class="hljs-number">20px</span>;
        }
        <span class="hljs-selector-class">.buyer</span> {
            <span class="hljs-attribute">margin-top</span>: <span class="hljs-number">20px</span>;
        }
        <span class="hljs-selector-class">.line-items</span> {
            <span class="hljs-attribute">margin-bottom</span>: <span class="hljs-number">20px</span>;
        }
        <span class="hljs-selector-class">.line-items</span> <span class="hljs-selector-tag">table</span> {
            <span class="hljs-attribute">width</span>: <span class="hljs-number">100%</span>;
            <span class="hljs-attribute">border-collapse</span>: collapse;
        }
        <span class="hljs-selector-class">.line-items</span> <span class="hljs-selector-tag">th</span>, <span class="hljs-selector-class">.line-items</span> <span class="hljs-selector-tag">td</span> {
            <span class="hljs-attribute">border</span>: <span class="hljs-number">1px</span> solid <span class="hljs-number">#000</span>;
            <span class="hljs-attribute">padding</span>: <span class="hljs-number">8px</span>;
            <span class="hljs-attribute">text-align</span>: left;
        }
        <span class="hljs-selector-class">.totals</span> {
            <span class="hljs-attribute">margin-bottom</span>: <span class="hljs-number">20px</span>;
        }
        <span class="hljs-selector-class">.totals</span> <span class="hljs-selector-tag">table</span> {
            <span class="hljs-attribute">width</span>: <span class="hljs-number">100%</span>;
            <span class="hljs-attribute">border-collapse</span>: collapse;
        }
        <span class="hljs-selector-class">.totals</span> <span class="hljs-selector-tag">td</span> {
            <span class="hljs-attribute">padding</span>: <span class="hljs-number">8px</span>;
            <span class="hljs-attribute">text-align</span>: right;
        }
        <span class="hljs-selector-class">.totals</span> <span class="hljs-selector-class">.total</span> {
            <span class="hljs-attribute">font-weight</span>: bold;
        }
        <span class="hljs-selector-class">.footer</span> {
            <span class="hljs-attribute">border-top</span>: <span class="hljs-number">1px</span> solid <span class="hljs-number">#000</span>;
            <span class="hljs-attribute">padding-top</span>: <span class="hljs-number">20px</span>;
            <span class="hljs-attribute">text-align</span>: left;
        }
    </span><span class="hljs-tag">&lt;/<span class="hljs-name">style</span>&gt;</span>
<span class="hljs-tag">&lt;/<span class="hljs-name">head</span>&gt;</span>
<span class="hljs-tag">&lt;<span class="hljs-name">body</span>&gt;</span>

    <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"container"</span>&gt;</span>
        <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"header"</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"buyer"</span>&gt;</span>
                <span class="hljs-tag">&lt;<span class="hljs-name">h3</span>&gt;</span>Bill To:<span class="hljs-tag">&lt;/<span class="hljs-name">h3</span>&gt;</span>
                <span class="hljs-tag">&lt;<span class="hljs-name">p</span>&gt;</span>Buyer LTD<span class="hljs-tag">&lt;/<span class="hljs-name">p</span>&gt;</span>
                <span class="hljs-tag">&lt;<span class="hljs-name">p</span>&gt;</span>Buyer Address<span class="hljs-tag">&lt;/<span class="hljs-name">p</span>&gt;</span>
                <span class="hljs-tag">&lt;<span class="hljs-name">p</span>&gt;</span>VAT: Vat-number<span class="hljs-tag">&lt;/<span class="hljs-name">p</span>&gt;</span>
            <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"seller"</span>&gt;</span>
                <span class="hljs-tag">&lt;<span class="hljs-name">h2</span>&gt;</span>Seller<span class="hljs-tag">&lt;/<span class="hljs-name">h2</span>&gt;</span>
                <span class="hljs-tag">&lt;<span class="hljs-name">p</span>&gt;</span>Seller Address<span class="hljs-tag">&lt;/<span class="hljs-name">p</span>&gt;</span>
                <span class="hljs-tag">&lt;<span class="hljs-name">p</span>&gt;</span>Seller Address2<span class="hljs-tag">&lt;/<span class="hljs-name">p</span>&gt;</span>
                <span class="hljs-tag">&lt;<span class="hljs-name">p</span>&gt;</span>T.I.C. No: Seller Tax Number<span class="hljs-tag">&lt;/<span class="hljs-name">p</span>&gt;</span>
                <span class="hljs-tag">&lt;<span class="hljs-name">p</span>&gt;</span>V.A.T. No: Seller Vat Number<span class="hljs-tag">&lt;/<span class="hljs-name">p</span>&gt;</span>
            <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
        <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>

        <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"invoice-info"</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">h2</span>&gt;</span>Tax Invoice No.: 1001/24<span class="hljs-tag">&lt;/<span class="hljs-name">h2</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">p</span>&gt;</span>Invoice Date: 02/01/2024<span class="hljs-tag">&lt;/<span class="hljs-name">p</span>&gt;</span>
        <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>

        <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"line-items"</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">table</span>&gt;</span>
                <span class="hljs-tag">&lt;<span class="hljs-name">tr</span>&gt;</span>
                    <span class="hljs-tag">&lt;<span class="hljs-name">th</span>&gt;</span>Description<span class="hljs-tag">&lt;/<span class="hljs-name">th</span>&gt;</span>
                    <span class="hljs-tag">&lt;<span class="hljs-name">th</span>&gt;</span>Amount (€)<span class="hljs-tag">&lt;/<span class="hljs-name">th</span>&gt;</span>
                <span class="hljs-tag">&lt;/<span class="hljs-name">tr</span>&gt;</span>
                <span class="hljs-tag">&lt;<span class="hljs-name">tr</span>&gt;</span>
                    <span class="hljs-tag">&lt;<span class="hljs-name">td</span>&gt;</span>Service Description<span class="hljs-tag">&lt;/<span class="hljs-name">td</span>&gt;</span>
                    <span class="hljs-tag">&lt;<span class="hljs-name">td</span>&gt;</span>1,000.00<span class="hljs-tag">&lt;/<span class="hljs-name">td</span>&gt;</span>
                <span class="hljs-tag">&lt;/<span class="hljs-name">tr</span>&gt;</span>
            <span class="hljs-tag">&lt;/<span class="hljs-name">table</span>&gt;</span>
        <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>

        <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"totals"</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">table</span>&gt;</span>
                <span class="hljs-tag">&lt;<span class="hljs-name">tr</span>&gt;</span>
                    <span class="hljs-tag">&lt;<span class="hljs-name">td</span>&gt;</span>Fees<span class="hljs-tag">&lt;/<span class="hljs-name">td</span>&gt;</span>
                    <span class="hljs-tag">&lt;<span class="hljs-name">td</span>&gt;</span>€1,000.00<span class="hljs-tag">&lt;/<span class="hljs-name">td</span>&gt;</span>
                <span class="hljs-tag">&lt;/<span class="hljs-name">tr</span>&gt;</span>
                <span class="hljs-tag">&lt;<span class="hljs-name">tr</span>&gt;</span>
                    <span class="hljs-tag">&lt;<span class="hljs-name">td</span>&gt;</span>VAT<span class="hljs-tag">&lt;/<span class="hljs-name">td</span>&gt;</span>
                    <span class="hljs-tag">&lt;<span class="hljs-name">td</span>&gt;</span>€0<span class="hljs-tag">&lt;/<span class="hljs-name">td</span>&gt;</span>
                <span class="hljs-tag">&lt;/<span class="hljs-name">tr</span>&gt;</span>
                <span class="hljs-tag">&lt;<span class="hljs-name">tr</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"total"</span>&gt;</span>
                    <span class="hljs-tag">&lt;<span class="hljs-name">td</span>&gt;</span>Total<span class="hljs-tag">&lt;/<span class="hljs-name">td</span>&gt;</span>
                    <span class="hljs-tag">&lt;<span class="hljs-name">td</span>&gt;</span>€1,000.00<span class="hljs-tag">&lt;/<span class="hljs-name">td</span>&gt;</span>
                <span class="hljs-tag">&lt;/<span class="hljs-name">tr</span>&gt;</span>
            <span class="hljs-tag">&lt;/<span class="hljs-name">table</span>&gt;</span>
        <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>

        <span class="hljs-tag">&lt;<span class="hljs-name">div</span> <span class="hljs-attr">class</span>=<span class="hljs-string">"footer"</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">h3</span>&gt;</span>Invoice Payable within 14 days to:<span class="hljs-tag">&lt;/<span class="hljs-name">h3</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">p</span>&gt;</span>Bank Name<span class="hljs-tag">&lt;/<span class="hljs-name">p</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">p</span>&gt;</span>IBAN: IBAN<span class="hljs-tag">&lt;/<span class="hljs-name">p</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">p</span>&gt;</span>BIC: Bic Number<span class="hljs-tag">&lt;/<span class="hljs-name">p</span>&gt;</span>
            <span class="hljs-tag">&lt;<span class="hljs-name">p</span>&gt;</span>Account Owner Name<span class="hljs-tag">&lt;/<span class="hljs-name">p</span>&gt;</span>
        <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>
    <span class="hljs-tag">&lt;/<span class="hljs-name">div</span>&gt;</span>

<span class="hljs-tag">&lt;/<span class="hljs-name">body</span>&gt;</span>
<span class="hljs-tag">&lt;/<span class="hljs-name">html</span>&gt;</span>
</code></pre>
<p><strong>DISCLAIMER: I used AI (ChatGPT) to assist creating the HTML</strong></p>
<p>Let's see how it looks:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718861159803/b1a17e02-84b7-4ffe-b40e-29a5482fb9c2.png" alt class="image--center mx-auto" /></p>
<p>The HTML and CSS need some polishing, but let's continue.</p>
<p>Now let's create a PDF from the command line to verify that weasyprint works.</p>
<pre><code class="lang-bash">weasyprint templates/invoice.pdf.html invoice.pdf
</code></pre>
<p>This creates a PDF document. It looks like this:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718861306216/f2e369a3-acf5-42b3-ad67-e36912d98946.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-creating-a-go-package-that-wraps-the-weasyprint-command">Creating a Go package that wraps the weasyprint command</h3>
<p>Create a folder <code>pkg/pdfgen</code></p>
<pre><code class="lang-bash">mkdir -p pkg/pdfgen
</code></pre>
<p>and add a file <code>pdfgen.go</code></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718861459795/ec42d453-8d67-4ebe-bdc0-962b1d398a8a.png" alt class="image--center mx-auto" /></p>
<p>Now in our pdfgen.go:</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> pdfgen

<span class="hljs-keyword">import</span> (
    <span class="hljs-string">"context"</span>
    <span class="hljs-string">"io"</span>
    <span class="hljs-string">"os"</span>
    <span class="hljs-string">"os/exec"</span>
)

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">Generate</span><span class="hljs-params">(ctx context.Context, w io.Writer, html []<span class="hljs-keyword">byte</span>)</span> <span class="hljs-title">error</span></span> {
    tempHTMLFile, err := os.CreateTemp(<span class="hljs-string">""</span>, <span class="hljs-string">"*.html"</span>)
    <span class="hljs-keyword">if</span> err != <span class="hljs-literal">nil</span> {
        <span class="hljs-keyword">return</span> err
    }

    <span class="hljs-keyword">defer</span> os.Remove(tempHTMLFile.Name())

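    <span class="hljs-keyword">if</span> _, err = tempHTMLFile.Write(html); err != <span class="hljs-literal">nil</span> {
        tempHTMLFile.Close() <span class="hljs-comment">// release the file descriptor before bailing out</span>
        <span class="hljs-keyword">return</span> err
    }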

    <span class="hljs-keyword">if</span> err = tempHTMLFile.Close(); err != <span class="hljs-literal">nil</span> {
        <span class="hljs-keyword">return</span> err
    }

    tempPDFFile, err := os.CreateTemp(<span class="hljs-string">""</span>, <span class="hljs-string">"*.pdf"</span>)
    <span class="hljs-keyword">if</span> err != <span class="hljs-literal">nil</span> {
        <span class="hljs-keyword">return</span> err
    }

    tempPDFFilePath := tempPDFFile.Name()
    <span class="hljs-keyword">defer</span> os.Remove(tempPDFFilePath)

    <span class="hljs-keyword">if</span> err = tempPDFFile.Close(); err != <span class="hljs-literal">nil</span> {
        <span class="hljs-keyword">return</span> err
    }

    cmd := exec.CommandContext(ctx, <span class="hljs-string">"weasyprint"</span>, tempHTMLFile.Name(), tempPDFFilePath)

    err = cmd.Run()

    <span class="hljs-keyword">if</span> err != <span class="hljs-literal">nil</span> {
        <span class="hljs-keyword">return</span> err
    }

    pdfFile, err := os.Open(tempPDFFilePath)
    <span class="hljs-keyword">if</span> err != <span class="hljs-literal">nil</span> {
        <span class="hljs-keyword">return</span> err
    }

    <span class="hljs-keyword">defer</span> pdfFile.Close()

    <span class="hljs-keyword">if</span> _, err = io.Copy(w, pdfFile); err != <span class="hljs-literal">nil</span> {
        <span class="hljs-keyword">return</span> err
    }

    <span class="hljs-keyword">return</span> <span class="hljs-literal">nil</span>
}
</code></pre>
<p>Let's explain what we did:</p>
<p>The function <code>Generate</code> accepts a context, an <code>io.Writer</code>, and a slice of bytes:</p>
<ul>
<li>The context is passed to <code>exec.CommandContext</code>, so cancelling it stops the weasyprint process.</li>
<li>The <code>io.Writer</code> is the "place" where we write the contents of the PDF.</li>
<li>The slice of bytes contains the HTML we want to convert to PDF.</li>
</ul>
<blockquote>
<p>We chose to use an <code>io.Writer</code> since this makes our package more versatile. We can pass in a file, an <code>http.ResponseWriter</code>, or anything else that implements that interface.</p>
</blockquote>
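<p>To make the point about <code>io.Writer</code> concrete, here is a small sketch. It assumes a stand-in function <code>generate</code> with the same signature as <code>Generate</code> that writes placeholder bytes instead of calling weasyprint, so it runs without any external tool. The handler name and the request path are made up for the illustration.</p>
<pre><code class="lang-go">package main

import (
	"bytes"
	"context"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// generate is a stand-in with the same signature as pdfgen.Generate.
// It writes placeholder bytes instead of invoking weasyprint.
func generate(ctx context.Context, w io.Writer, html []byte) error {
	_, err := fmt.Fprintf(w, "%%PDF-placeholder (%d bytes of HTML)", len(html))
	return err
}

// invoiceHandler (hypothetical) streams the document straight into the HTTP response.
func invoiceHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/pdf")
	if err := generate(r.Context(), w, []byte("Invoice 1001/24")); err != nil {
		http.Error(w, "could not generate PDF", http.StatusInternalServerError)
	}
}

func main() {
	html := []byte("Invoice 1001/24")

	// Destination 1: an in-memory buffer.
	buf := new(bytes.Buffer)
	if err := generate(context.Background(), buf, html); err != nil {
		fmt.Println("buffer:", err)
		return
	}
	fmt.Println("buffer holds", buf.Len(), "bytes")

	// Destination 2: an HTTP response, recorded in memory for the demo.
	rec := httptest.NewRecorder()
	invoiceHandler(rec, httptest.NewRequest(http.MethodGet, "/invoice.pdf", nil))
	fmt.Println("response:", rec.Code, rec.Header().Get("Content-Type"))
}
</code></pre>
<p>The same call would also accept an <code>*os.File</code>, because the function depends only on the <code>Write</code> method.</p>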
<p>The above function works as follows:</p>
<ol>
<li><p>creates a temporary file and writes our HTML contents to it</p>
</li>
<li><p>creates another temporary file just to reserve a unique name for the PDF and closes it right away (both temporary files are removed when the function returns)</p>
</li>
<li><p>invokes the weasyprint command</p>
</li>
<li><p>reads the file that weasyprint created (the PDF) and copies the contents to the io.Writer we passed as argument</p>
</li>
</ol>
<blockquote>
<p>Step 2 could be replaced by generating a random file name ourselves. But then we would have to work out the temporary directory, which is not the same on all systems or configurations. Since Go handles this for us, we just (ab)use that.</p>
</blockquote>
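<p>Step 3 calls <code>cmd.Run()</code>, which discards anything weasyprint prints, so a failed conversion only reports an exit status. One possible refinement, sketched below under the assumption that you want the tool's output in the error, is to use <code>CombinedOutput</code>. The helper <code>runTool</code> is hypothetical and not part of the package above.</p>
<pre><code class="lang-go">package main

import (
	"context"
	"fmt"
	"os/exec"
	"strings"
	"time"
)

// runTool (hypothetical helper) runs an external command and, on failure,
// folds whatever the command printed into the returned error.
func runTool(ctx context.Context, name string, args ...string) error {
	out, err := exec.CommandContext(ctx, name, args...).CombinedOutput()
	if err != nil {
		return fmt.Errorf("%s failed: %w: %s", name, err, strings.TrimSpace(string(out)))
	}
	return nil
}

func main() {
	// A timeout keeps a stuck conversion from blocking the caller forever.
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()

	if err := runTool(ctx, "weasyprint", "--version"); err != nil {
		fmt.Println("weasyprint is not usable:", err)
		return
	}
	fmt.Println("weasyprint is installed")
}
</code></pre>
<p>Inside <code>Generate</code>, the same helper could take the two temporary file paths as its arguments.</p>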
<p>Now let's write a unit test.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718862162980/2867e7c9-7087-47ab-91ab-e7a3e302b6a5.png" alt class="image--center mx-auto" /></p>
<p>Let's run our tests:</p>
<pre><code class="lang-bash">go <span class="hljs-built_in">test</span> -v ./...
</code></pre>
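<p>The test itself is only shown as a screenshot. One check such a test could make (an assumption here, not necessarily what the screenshot does) is that the generated bytes start with the <code>%PDF-</code> marker that opens every PDF file. The helper <code>looksLikePDF</code> below is a hypothetical name for that check.</p>
<pre><code class="lang-go">package main

import (
	"bytes"
	"fmt"
)

// looksLikePDF (hypothetical helper) reports whether data begins with
// the "%PDF-" marker found at the start of a PDF file.
func looksLikePDF(data []byte) bool {
	return bytes.HasPrefix(data, []byte("%PDF-"))
}

func main() {
	fmt.Println(looksLikePDF([]byte("%PDF-1.7 rest of the file"))) // true
	fmt.Println(looksLikePDF([]byte("not a pdf")))                 // false
}
</code></pre>
<p>In a real test you would pass a <code>bytes.Buffer</code> to <code>Generate</code> and run the check on <code>buf.Bytes()</code>.</p>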
<hr />
<p>Let's commit our work so far</p>
<pre><code class="lang-bash">git add pkg/ templates/
git commit -m <span class="hljs-string">"library that generates pdf from html"</span>
</code></pre>
<blockquote>
<p>We haven't used our new package in our freelance invoice management project yet. We only wrote a library and we will utilize it in the next article</p>
</blockquote>
<p>Find the code in the article's branch on <a target="_blank" href="https://github.com/gosom/freelance-invoice-hub/tree/invoice-generation">GitHub</a>.</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>In this article we created a small Go package that helps us generate PDFs. Basically, we just wrote a wrapper over a command-line tool that is available on most platforms.</p>
<p>I found that creating PDFs this way works very well in Go. There are some Go libraries like <a target="_blank" href="https://github.com/pdfcpu/pdfcpu">pdfcpu</a>, but for the purpose of creating a PDF out of HTML I found that weasyprint works very well.</p>
<p>In the next article we will integrate our newly created pdfgen package in our application in order to generate PDFs from our invoices.</p>
<p>❤️ Follow me on <a target="_blank" href="https://x.com/gkomdev">X</a> or <a target="_blank" href="http://www.linkedin.com/in/georgios-komninos-172508147">LinkedIn</a></p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text">Do you know a better way to generate PDFs in Go?</div>
</div>]]></content:encoded></item><item><title><![CDATA[Crafting a Web app in Golang: Implementing the Service Layer]]></title><description><![CDATA[Tutorial: golang web application development
Hello all, today we will continue the development of our web application by implementing the part of invoice management.
We are going to implement the InvoiceService interface. We will learn also how we ca...]]></description><link>https://blog.gkomninos.com/crafting-a-web-app-in-golang-implementing-the-service-layer</link><guid isPermaLink="true">https://blog.gkomninos.com/crafting-a-web-app-in-golang-implementing-the-service-layer</guid><category><![CDATA[Programming Blogs]]></category><category><![CDATA[Go Language]]></category><category><![CDATA[coding]]></category><category><![CDATA[Web Development]]></category><category><![CDATA[golang]]></category><dc:creator><![CDATA[Georgios Komninos]]></dc:creator><pubDate>Thu, 20 Jun 2024 05:00:20 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1718775235194/a064fca3-1f05-40fe-8eb5-35032ddebb34.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Tutorial: golang web application development</p>
<p>Hello all, today we will continue the development of our <a target="_blank" href="https://blog.gkomninos.com/series/webapp-using-golang">web application</a> by implementing the invoice management part.</p>
<p>We are going to implement the <code>InvoiceService</code> interface. We will also learn how to use mocks to unit test our service.</p>
<h2 id="heading-the-interface">The interface</h2>
<p>For managing the invoices we have defined the following interface:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718775623564/4ce2245c-9c66-4499-b6a0-ca8e23bca4da.png" alt class="image--center mx-auto" /></p>
<p>The interface has 3 methods:</p>
<ul>
<li><p>Create: creates a new invoice</p>
</li>
<li><p>Get: returns the invoice with the given id</p>
</li>
<li><p>CreatePDF: given an invoice id, returns a PDF document</p>
</li>
</ul>
<h3 id="heading-lets-get-started">Let's get started</h3>
<p>Create a new git branch</p>
<pre><code class="lang-bash">git checkout -b invoice-svc
</code></pre>
<p>Create a new package named invoices</p>
<pre><code class="lang-bash">mkdir invoices
</code></pre>
<p>and a file <code>invoices/invoices.go</code>.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718776220078/7e66a43f-255b-4c5c-8ab0-876a3ee93276.png" alt class="image--center mx-auto" /></p>
<p>and add the following contents to the newly created <code>invoices.go</code>.</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> invoices

<span class="hljs-keyword">import</span> (
    <span class="hljs-string">"context"</span>
    <span class="hljs-string">"invoicehub"</span>
)

<span class="hljs-keyword">var</span> _ invoicehub.InvoiceService = (*invoiceSvc)(<span class="hljs-literal">nil</span>)

<span class="hljs-keyword">type</span> invoiceSvc <span class="hljs-keyword">struct</span> {
    repo invoicehub.InvoiceRepository
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">New</span><span class="hljs-params">(repo invoicehub.InvoiceRepository)</span> <span class="hljs-title">invoicehub</span>.<span class="hljs-title">InvoiceService</span></span> {
    <span class="hljs-keyword">return</span> &amp;invoiceSvc{repo}
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-params">(svc *invoiceSvc)</span> <span class="hljs-title">Create</span><span class="hljs-params">(ctx context.Context, invoice *invoicehub.Invoice)</span> <span class="hljs-title">error</span></span> {
    <span class="hljs-keyword">return</span> <span class="hljs-literal">nil</span>
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-params">(svc *invoiceSvc)</span> <span class="hljs-title">Get</span><span class="hljs-params">(ctx context.Context, id <span class="hljs-keyword">int</span>)</span> <span class="hljs-params">(invoicehub.Invoice, error)</span></span> {
    <span class="hljs-keyword">return</span> invoicehub.Invoice{}, <span class="hljs-literal">nil</span>
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-params">(svc *invoiceSvc)</span> <span class="hljs-title">CreatePDF</span><span class="hljs-params">(ctx context.Context, id <span class="hljs-keyword">int</span>)</span> <span class="hljs-params">([]<span class="hljs-keyword">byte</span>, error)</span></span> {
    <span class="hljs-keyword">return</span> <span class="hljs-literal">nil</span>, <span class="hljs-literal">nil</span>
}
</code></pre>
<p>We haven't done much here actually, just created a struct that implements the interface.</p>
<p>Let's now create the file where we will add our tests:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718776451255/484351d4-3831-49ed-81b7-03791b54d918.png" alt class="image--center mx-auto" /></p>
<p>and let's add placeholders for our tests:</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> invoices_test

<span class="hljs-keyword">import</span> <span class="hljs-string">"testing"</span>

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">Test_New</span><span class="hljs-params">(t *testing.T)</span></span> {
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">Test_invoiceSvc_Create</span><span class="hljs-params">(t *testing.T)</span></span> {
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">Test_invoiceSvc_Get</span><span class="hljs-params">(t *testing.T)</span></span> {
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">Test_invoiceSvc_CreatePDF</span><span class="hljs-params">(t *testing.T)</span></span> {
}
</code></pre>
<h3 id="heading-how-to-use-mocks">How to use mocks</h3>
<p>In order to write our tests we need to be able to provide a "mock" implementation of the <code>InvoiceRepository</code>. This lets us test the behavior of our implementation without having to rely on a real database.</p>
<p>When we created our project layout we created a package called <code>mocks</code> that we haven't used so far.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718776853181/b464fa7a-8297-4e3d-85b0-d5e22ff725e4.png" alt class="image--center mx-auto" /></p>
<p>We are going to utilize the package <a target="_blank" href="https://github.com/uber-go/mock">mock</a> and <code>go generate</code> to create mocks for our interfaces.</p>
<p><strong>Setting up mock generator</strong></p>
<p>First install the required package:</p>
<pre><code class="lang-bash">go install go.uber.org/mock/mockgen@latest
go get go.uber.org/mock/mockgen/model
</code></pre>
<p>and add the following in our Makefile:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718777210912/cc7fa621-14f0-41a4-8b5d-3bb5d3d5dbf5.png" alt class="image--center mx-auto" /></p>
<p>Now we can add a special comment line above the definition of the interfaces we want to mock.</p>
<p>The comment has the following format:</p>
<pre><code class="lang-go"><span class="hljs-comment">//go:generate mockgen -destination=mocks/mock_*.go -package=mocks . InterfaceName</span>
</code></pre>
<p>So let's open the file <code>invoices.go</code> and add the comment above the definition of the <code>InvoiceRepository</code>:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718777434120/c39e2797-7490-4c4e-8570-84721b81d414.png" alt class="image--center mx-auto" /></p>
<p>Now run:</p>
<pre><code class="lang-bash">make gen
</code></pre>
<p>This will generate the file <code>mocks/mock_invoice_repo.go</code>:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718777565433/53856a01-714d-4b70-9b8e-d79a4642e50d.png" alt class="image--center mx-auto" /></p>
<p>If you open the file you will see that it generated a struct named <code>MockInvoiceRepository</code> that implements the <code>InvoiceRepository</code> interface.</p>
<p>Each time you add a method to the <code>InvoiceRepository</code> you have to run the <code>make gen</code> command to re-generate the mock implementation.</p>
<p>Now let's open the file <code>invoices/invoices_test.go</code> and add the following:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718778002008/07eba03a-5209-4e80-bef2-cb760d7bfbe6.png" alt class="image--center mx-auto" /></p>
<p>The above shows how we can initialize an invoice service using a mock implementation of the InvoiceRepository.</p>
<h3 id="heading-unit-tests-and-implementation">Unit Tests and Implementation</h3>
<p>We will now implement our interface's methods one by one.</p>
<p>We start from the <code>Create</code> method.</p>
<p>Our Create method should work as follows:</p>
<p>We pass an <code>Invoice</code> struct as an argument and the method is responsible for:</p>
<p>calculating the <code>InvoiceNumber</code> based on the <code>IssueDate</code> and the number of the last invoice issued in that year. If no invoice exists for that year yet, we start with invoice number 1001 😄.</p>
<p>Then we save the newly created invoice in the database.</p>
<p>Let's do that:</p>
<pre><code class="lang-go"><span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-params">(svc *invoiceSvc)</span> <span class="hljs-title">Create</span><span class="hljs-params">(ctx context.Context, invoice *invoicehub.Invoice)</span> <span class="hljs-title">error</span></span> {
    issueYear := invoice.IssueDate.Year()

    lastInvoice, err := svc.repo.GetLastInvoiceForYear(ctx, issueYear)
    <span class="hljs-keyword">if</span> err != <span class="hljs-literal">nil</span> {
        <span class="hljs-keyword">return</span> err
    }

    parts := strings.Split(lastInvoice.InvoiceNumber, <span class="hljs-string">"/"</span>)
    <span class="hljs-keyword">if</span> <span class="hljs-built_in">len</span>(parts) != <span class="hljs-number">2</span> {
        <span class="hljs-keyword">return</span> errors.New(<span class="hljs-string">"invalid invoice number"</span>)
    }

    lastInvoiceNumber, err := strconv.Atoi(parts[<span class="hljs-number">0</span>])
    <span class="hljs-keyword">if</span> err != <span class="hljs-literal">nil</span> {
        <span class="hljs-keyword">return</span> err
    }

    invoice.InvoiceNumber = fmt.Sprintf(<span class="hljs-string">"%d/%d"</span>, lastInvoiceNumber+<span class="hljs-number">1</span>, issueYear)

    <span class="hljs-keyword">if</span> _, err := svc.repo.Create(ctx, invoice); err != <span class="hljs-literal">nil</span> {
        <span class="hljs-keyword">return</span> err
    }

    <span class="hljs-keyword">return</span> <span class="hljs-literal">nil</span>
}
</code></pre>
<p>Now we have to write some tests to see if this works.</p>
<p>In your <code>invoices/invoices_test.go</code>:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718778994264/bdba7d27-451e-4852-b8d2-4175a79f6cb5.png" alt class="image--center mx-auto" /></p>
<p>Here I have added some test cases. They do not cover 100% of the cases, but they are good enough for now.</p>
<p>Let's implement the first case:</p>
<pre><code class="lang-go">    t.Run(<span class="hljs-string">"when there is no last invoice for the current year"</span>, <span class="hljs-function"><span class="hljs-keyword">func</span><span class="hljs-params">(t *testing.T)</span></span> {
        inv := &amp;invoicehub.Invoice{
            IssueDate: time.Date(<span class="hljs-number">2024</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, time.UTC),
        }

        err := svc.Create(context.Background(), inv)
        require.NoError(t, err)
    })
</code></pre>
<p>Now let's run the test and see what happens:</p>
<pre><code class="lang-bash">go <span class="hljs-built_in">test</span> -v ./... -run=Test_invoiceSvc_Create/when_there_is_no_last_invoice_for_the_current_year
</code></pre>
<p>and we get:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718779300080/3e1a0adc-a603-4b8f-8930-69438c1d629e.png" alt class="image--center mx-auto" /></p>
<p>The error tells us that our method called the method:<br /><code>*mocks.MockInvoiceRepository.GetLastInvoiceForYear</code></p>
<p>and this was not expected.</p>
<p>We need to tell the mock what the call to this method should return.</p>
<p><mark>But wait a second... What does our repository return when a record is not found in the database?</mark></p>
<p>It returns a <a target="_blank" href="https://gorm.io/docs/error_handling.html#ErrRecordNotFound">gorm.ErrRecordNotFound</a> error. We could return that, BUT this would leak implementation details into our service layer. We have to define a custom error in our domain that we can use instead.</p>
<p>Open <code>invoice.go</code> and add at the top:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718779870957/7471065c-62b8-41e7-9385-ffa88ac0bc00.png" alt class="image--center mx-auto" /></p>
<p>Now open <code>sqlite/invoice.go</code> and return that error when an invoice is not found.<br />We have to do that in two places:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718779919807/cb3ceb4a-fc2d-4b93-bfef-295ed8272a45.png" alt class="image--center mx-auto" /></p>
<p>Let's also modify the InvoiceRepository's tests to check that this error is returned when an invoice is not found.</p>
<p>In <code>func Test_invoiceRepository(t *testing.T)</code>,</p>
<p>where an invoice is not found, assert that the error is of this type:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718780026570/adc653ba-0f8f-4100-b382-41d632027090.png" alt class="image--center mx-auto" /></p>
<p>and at the bottom:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718780081049/9b91f1ce-da36-49b9-ad1d-0a5115299933.png" alt class="image--center mx-auto" /></p>
<p>Now let's run that test to make sure it works:</p>
<pre><code class="lang-bash">go <span class="hljs-built_in">test</span> -v ./... -run=Test_invoiceRepository
</code></pre>
<p>Now we can proceed with writing the test for the Create method of the InvoiceService:</p>
<pre><code class="lang-go">    t.Run(<span class="hljs-string">"when there is no last invoice for the current year"</span>, <span class="hljs-function"><span class="hljs-keyword">func</span><span class="hljs-params">(t *testing.T)</span></span> {
        inv := &amp;invoicehub.Invoice{
            IssueDate: time.Date(<span class="hljs-number">2024</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, time.UTC),
        }

        repo.EXPECT().GetLastInvoiceForYear(gomock.Any(), <span class="hljs-number">2024</span>).Return(invoicehub.Invoice{}, invoicehub.ErrInvoiceNotFound)

        err := svc.Create(context.Background(), inv)
        require.NoError(t, err)
    })
</code></pre>
<p>But our test still fails:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718780340040/aef876e5-53c3-4c83-a724-fc6f81657a0d.png" alt class="image--center mx-auto" /></p>
<p>Why is that?</p>
<p>In our implementation we do this:</p>
<pre><code class="lang-go">    lastInvoice, err := svc.repo.GetLastInvoiceForYear(ctx, issueYear)
    <span class="hljs-keyword">if</span> err != <span class="hljs-literal">nil</span> {
        <span class="hljs-keyword">return</span> err
    }
</code></pre>
<p><strong>But when there is no invoice in the database, the repository returns ErrInvoiceNotFound.</strong></p>
<p>Our requirements say that when no invoice exists for this year, we should generate a new invoice number starting with 1001.</p>
<p>Let's fix it:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718780654512/c3a294f4-970a-4a3d-ab85-ec6fd43d354e.png" alt class="image--center mx-auto" /></p>
<p>So now we check for that error and create the invoice number we need.</p>
<p>But again the test still fails 😟</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718780722528/16b96781-5e15-4a2f-83b5-159d516adfd4.png" alt class="image--center mx-auto" /></p>
<p>OK, this is actually good. We know what is going on: our code calls the repo.Create method, but we have not set an expectation for it on the mock as we did above.</p>
<p>The final code for that test case becomes:</p>
<pre><code class="lang-go">    t.Run(<span class="hljs-string">"when there is no last invoice for the current year"</span>, <span class="hljs-function"><span class="hljs-keyword">func</span><span class="hljs-params">(t *testing.T)</span></span> {
        inv := &amp;invoicehub.Invoice{
            IssueDate: time.Date(<span class="hljs-number">2024</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, time.UTC),
        }

        repo.EXPECT().GetLastInvoiceForYear(gomock.Any(), <span class="hljs-number">2024</span>).Return(invoicehub.Invoice{}, invoicehub.ErrInvoiceNotFound)
        repo.EXPECT().Create(gomock.Any(), inv).Return(<span class="hljs-number">1</span>, <span class="hljs-literal">nil</span>)

        err := svc.Create(context.Background(), inv)
        require.NoError(t, err)
    })
</code></pre>
<p>Now let's quickly complete the remaining test cases.<br />Our test now looks like this:</p>
<pre><code class="lang-go"><span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">Test_invoiceSvc_Create</span><span class="hljs-params">(t *testing.T)</span></span> {
    mctrl := gomock.NewController(t)
    <span class="hljs-keyword">defer</span> mctrl.Finish()

    repo := mocks.NewMockInvoiceRepository(mctrl)

    svc := invoices.New(repo)

    t.Run(<span class="hljs-string">"when there is no last invoice for the current year"</span>, <span class="hljs-function"><span class="hljs-keyword">func</span><span class="hljs-params">(t *testing.T)</span></span> {
        inv := &amp;invoicehub.Invoice{
            IssueDate: time.Date(<span class="hljs-number">2024</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, time.UTC),
        }

        repo.EXPECT().GetLastInvoiceForYear(gomock.Any(), <span class="hljs-number">2024</span>).Return(invoicehub.Invoice{}, invoicehub.ErrInvoiceNotFound)
        repo.EXPECT().Create(gomock.Any(), inv).Return(<span class="hljs-number">1</span>, <span class="hljs-literal">nil</span>)

        err := svc.Create(context.Background(), inv)
        require.NoError(t, err)

        require.Equal(t, <span class="hljs-string">"1001/2024"</span>, inv.InvoiceNumber)
    })

    t.Run(<span class="hljs-string">"when there is a last invoice for the current year"</span>, <span class="hljs-function"><span class="hljs-keyword">func</span><span class="hljs-params">(t *testing.T)</span></span> {
        inv1 := &amp;invoicehub.Invoice{
            IssueDate:     time.Date(<span class="hljs-number">2024</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, time.UTC),
            InvoiceNumber: <span class="hljs-string">"1001/2024"</span>,
        }

        invNew := &amp;invoicehub.Invoice{
            IssueDate: time.Date(<span class="hljs-number">2024</span>, <span class="hljs-number">2</span>, <span class="hljs-number">5</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, time.UTC),
        }

        repo.EXPECT().GetLastInvoiceForYear(gomock.Any(), <span class="hljs-number">2024</span>).Return(*inv1, <span class="hljs-literal">nil</span>)

        repo.EXPECT().Create(gomock.Any(), invNew).Return(<span class="hljs-number">2</span>, <span class="hljs-literal">nil</span>)

        err := svc.Create(context.Background(), invNew)
        require.NoError(t, err)

        require.Equal(t, <span class="hljs-string">"1002/2024"</span>, invNew.InvoiceNumber)
    })

    t.Run(<span class="hljs-string">"when there is a database error while getting the last invoice"</span>, <span class="hljs-function"><span class="hljs-keyword">func</span><span class="hljs-params">(t *testing.T)</span></span> {
        inv := &amp;invoicehub.Invoice{
            IssueDate: time.Date(<span class="hljs-number">2024</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, time.UTC),
        }

        repo.EXPECT().GetLastInvoiceForYear(gomock.Any(), <span class="hljs-number">2024</span>).Return(invoicehub.Invoice{}, errors.New(<span class="hljs-string">"something went wrong"</span>))

        err := svc.Create(context.Background(), inv)
        require.Error(t, err)
    })

    t.Run(<span class="hljs-string">"when we cannot save the invoice"</span>, <span class="hljs-function"><span class="hljs-keyword">func</span><span class="hljs-params">(t *testing.T)</span></span> {
        invoice := &amp;invoicehub.Invoice{
            IssueDate: time.Date(<span class="hljs-number">2024</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, time.UTC),
        }

        repo.EXPECT().GetLastInvoiceForYear(gomock.Any(), <span class="hljs-number">2024</span>).Return(invoicehub.Invoice{}, invoicehub.ErrInvoiceNotFound)
        repo.EXPECT().Create(gomock.Any(), invoice).Return(<span class="hljs-number">0</span>, errors.New(<span class="hljs-string">"something went wrong"</span>))

        err := svc.Create(context.Background(), invoice)
        require.Error(t, err)
    })
}
</code></pre>
<p>Verify that the tests pass:</p>
<pre><code class="lang-bash">go <span class="hljs-built_in">test</span> -v ./...
</code></pre>
<p><strong>The Get method</strong></p>
<p>We have to implement the Get method of the InvoiceService now.</p>
<p>We start again from the unit tests and then we will proceed with the implementation.</p>
<pre><code class="lang-go"><span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">Test_invoiceSvc_Get</span><span class="hljs-params">(t *testing.T)</span></span> {
    mctrl := gomock.NewController(t)
    <span class="hljs-keyword">defer</span> mctrl.Finish()

    repo := mocks.NewMockInvoiceRepository(mctrl)

    svc := invoices.New(repo)

    t.Run(<span class="hljs-string">"when the invoice is found"</span>, <span class="hljs-function"><span class="hljs-keyword">func</span><span class="hljs-params">(t *testing.T)</span></span> {
        expected := invoicehub.Invoice{
            ID:            <span class="hljs-number">1</span>,
            InvoiceNumber: <span class="hljs-string">"1001/2024"</span>,
            IssueDate:     time.Date(<span class="hljs-number">2024</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, time.UTC),
            SellerID:      <span class="hljs-number">1</span>,
            BuyerID:       <span class="hljs-number">2</span>,
            DaysToPay:     <span class="hljs-number">14</span>,
        }

        repo.EXPECT().Get(gomock.Any(), <span class="hljs-number">1</span>).Return(expected, <span class="hljs-literal">nil</span>)

        inv, err := svc.Get(context.Background(), <span class="hljs-number">1</span>)
        require.NoError(t, err)

        require.Equal(t, expected, inv)
    })

    t.Run(<span class="hljs-string">"when the invoice is not found"</span>, <span class="hljs-function"><span class="hljs-keyword">func</span><span class="hljs-params">(t *testing.T)</span></span> {
        repo.EXPECT().Get(gomock.Any(), <span class="hljs-number">1</span>).Return(invoicehub.Invoice{}, invoicehub.ErrInvoiceNotFound)

        _, err := svc.Get(context.Background(), <span class="hljs-number">1</span>)
        require.Error(t, err)
        require.ErrorIs(t, err, invoicehub.ErrInvoiceNotFound)
    })

    t.Run(<span class="hljs-string">"when there is a database error"</span>, <span class="hljs-function"><span class="hljs-keyword">func</span><span class="hljs-params">(t *testing.T)</span></span> {
        repo.EXPECT().Get(gomock.Any(), <span class="hljs-number">1</span>).Return(invoicehub.Invoice{}, errors.New(<span class="hljs-string">"something went wrong"</span>))

        _, err := svc.Get(context.Background(), <span class="hljs-number">1</span>)
        require.Error(t, err)
    })
}
</code></pre>
<p>The implementation is pretty straightforward here: we just call the repo's Get method.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718781606643/7be84f57-0b33-4027-8e36-6a8b6bb41530.png" alt class="image--center mx-auto" /></p>
<p>Let's commit our work:</p>
<pre><code class="lang-bash">git add .
git commit -m <span class="hljs-string">"partial implementation of the InvoiceService"</span>
</code></pre>
<h2 id="heading-conclusion">Conclusion</h2>
<p>In today's blog post we continued the implementation of our <a target="_blank" href="https://blog.gkomninos.com/series/webapp-using-golang">Invoice management web application</a>. The new thing that we discussed today is how to use mocks in golang to assist you with unit testing.</p>
<p>Find all the code in the <a target="_blank" href="https://github.com/gosom/freelance-invoice-hub/tree/invoice-svc">GitHub repo</a>.</p>
<p>In the next blog post we will continue the implementation of the InvoiceService by adding the required function to generate PDF using golang.</p>
<p>❤️ In case you found this article useful hit the Like button and follow me on <a target="_blank" href="https://x.com/gkomdev">X</a> .</p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text">How do you mock in Go? Please leave a comment</div>
</div>]]></content:encoded></item><item><title><![CDATA[Tutorial: Repository pattern in Golang with Test Driven Development]]></title><description><![CDATA[In the last blog post I showed you how you can implement the repository pattern in Golang. We used SQLite and GORM to implement the CompanyRepository .
Today I am going to show you how to implement the InvoiceRepository .The implementation will be ve...]]></description><link>https://blog.gkomninos.com/tutorial-repository-pattern-in-golang-with-test-driven-development</link><guid isPermaLink="true">https://blog.gkomninos.com/tutorial-repository-pattern-in-golang-with-test-driven-development</guid><category><![CDATA[General Programming]]></category><category><![CDATA[golang]]></category><category><![CDATA[coding]]></category><category><![CDATA[Web Development]]></category><dc:creator><![CDATA[Georgios Komninos]]></dc:creator><pubDate>Wed, 19 Jun 2024 05:00:40 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1718689392621/94038a6e-ba4d-40bb-b8f5-fa699a0568ac.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In the last <a target="_blank" href="https://blog.gkomninos.com/tutorial-implementing-repository-with-gorm-and-sqlite?source=more_series_bottom_blogs">blog post</a> I showed you how you can implement the repository pattern in Golang. We used <a target="_blank" href="https://sqlite.org/">SQLite</a> and <a target="_blank" href="https://gorm.io/index.html">GORM</a> to implement the <code>CompanyRepository</code> .</p>
<p>Today I am going to show you how to implement the <code>InvoiceRepository</code> .<br />The implementation will be very similar but today we are going to start first by writing unit tests and then the actual implementation.</p>
<h3 id="heading-create-a-new-branch">Create a new branch</h3>
<p>We will work in a new git branch. Let's create it:</p>
<pre><code class="lang-bash">git checkout -b invoice-repo
</code></pre>
<h3 id="heading-our-interface">Our interface</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718690045267/b0768862-3a7b-4c83-8fef-fcf2af7728d1.png" alt class="image--center mx-auto" /></p>
<p>We have to implement a struct that implements these three methods:</p>
<ul>
<li><p><code>Create</code>: Inserts an invoice in the database and returns its id</p>
</li>
<li><p><code>Get</code>: Fetches an invoice from the database using its id</p>
</li>
<li><p><code>GetLastInvoiceForYear</code>: Returns the last invoice from the provided year.</p>
</li>
</ul>
<h3 id="heading-writing-a-test-first">Writing a test first</h3>
<blockquote>
<p>In which package are we going to write our tests?</p>
</blockquote>
<p>In our project layout we have a package <code>sqlite</code> . This is where we are going to write our code.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718690323826/b6b8448b-b09f-4594-851d-5d06f6280c3a.png" alt class="image--center mx-auto" /></p>
<p>Now create a new file named <code>invoice_test.go</code> .</p>
<p>Let's write our tests:</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> sqlite_test

<span class="hljs-keyword">import</span> (
    <span class="hljs-string">"testing"</span>
)

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">Test_invoiceRepository</span><span class="hljs-params">(t *testing.T)</span></span> {
}
</code></pre>
<p>We don't test anything so far but let's verify that we can run our test:</p>
<pre><code class="lang-bash">go <span class="hljs-built_in">test</span> -v ./... -run=Test_invoiceRepository
</code></pre>
<p>the output should be similar to:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718690716752/1290c071-5bea-4951-8416-c85a99a05db8.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-provide-a-dummy-implementation-of-the-interface">Provide a dummy implementation of the interface</h3>
<p>Create a file <code>sqlite/invoice.go</code> with contents:</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> sqlite

<span class="hljs-keyword">import</span> (
    <span class="hljs-string">"context"</span>
    <span class="hljs-string">"invoicehub"</span>

    <span class="hljs-string">"gorm.io/gorm"</span>
)

<span class="hljs-keyword">var</span> _ invoicehub.InvoiceRepository = &amp;invoiceRepository{}

<span class="hljs-keyword">type</span> invoiceRepository <span class="hljs-keyword">struct</span> {
    db *gorm.DB
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">NewInvoiceRepository</span><span class="hljs-params">(db *gorm.DB)</span> <span class="hljs-title">invoicehub</span>.<span class="hljs-title">InvoiceRepository</span></span> {
    <span class="hljs-keyword">return</span> &amp;invoiceRepository{db}
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-params">(r *invoiceRepository)</span> <span class="hljs-title">Create</span><span class="hljs-params">(ctx context.Context, invoice *invoicehub.Invoice)</span> <span class="hljs-params">(<span class="hljs-keyword">int</span>, error)</span></span> {
    <span class="hljs-keyword">return</span> <span class="hljs-number">0</span>, <span class="hljs-literal">nil</span>
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-params">(r *invoiceRepository)</span> <span class="hljs-title">Get</span><span class="hljs-params">(ctx context.Context, id <span class="hljs-keyword">int</span>)</span> <span class="hljs-params">(invoicehub.Invoice, error)</span></span> {
    <span class="hljs-keyword">return</span> invoicehub.Invoice{}, <span class="hljs-literal">nil</span>
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-params">(r *invoiceRepository)</span> <span class="hljs-title">GetLastInvoiceForYear</span><span class="hljs-params">(ctx context.Context, year <span class="hljs-keyword">int</span>)</span> <span class="hljs-params">(invoicehub.Invoice, error)</span></span> {
    <span class="hljs-keyword">return</span> invoicehub.Invoice{}, <span class="hljs-literal">nil</span>
}
</code></pre>
<p>So far we have only declared the required methods; their bodies are empty stubs.</p>
<p>We now create our tests:</p>
<pre><code class="lang-go"><span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">compareInvoice</span><span class="hljs-params">(t *testing.T, a, b invoicehub.Invoice)</span></span> {
    require.Equal(t, a.ID, b.ID, <span class="hljs-string">"ID mismatch"</span>)
    require.Equal(t, a.InvoiceNumber, b.InvoiceNumber, <span class="hljs-string">"InvoiceNumber mismatch"</span>)
    require.Equal(t, a.IssueDate, b.IssueDate, <span class="hljs-string">"IssueDate mismatch"</span>)
    require.Equal(t, a.SellerID, b.SellerID, <span class="hljs-string">"SellerID mismatch"</span>)
    require.Equal(t, a.BuyerID, b.BuyerID, <span class="hljs-string">"BuyerID mismatch"</span>)
    require.Equal(t, a.DaysToPay, b.DaysToPay, <span class="hljs-string">"DaysToPay mismatch"</span>)

    require.Len(t, a.LineItems, <span class="hljs-built_in">len</span>(b.LineItems), <span class="hljs-string">"LineItems length mismatch"</span>)

    <span class="hljs-keyword">for</span> i := <span class="hljs-keyword">range</span> a.LineItems {
        require.Equal(t, a.LineItems[i].Description, b.LineItems[i].Description, <span class="hljs-string">"Description mismatch"</span>)
        require.Equal(t, a.LineItems[i].Amount.Currency, b.LineItems[i].Amount.Currency, <span class="hljs-string">"Currency mismatch"</span>)
        require.Equal(t, a.LineItems[i].Amount.Value.String(), b.LineItems[i].Amount.Value.String(), <span class="hljs-string">"Amount value mismatch"</span>)
    }
}


<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">Test_invoiceRepository</span><span class="hljs-params">(t *testing.T)</span></span> {
    db, err := sqlite.SetupDB(<span class="hljs-string">":memory:"</span>)
    require.NoError(t, err)

    repo := sqlite.NewInvoiceRepository(db)
    require.NotNil(t, repo)

    ctx := context.Background()

    invoice := invoicehub.Invoice{
        InvoiceNumber: <span class="hljs-string">"1001/24"</span>,
        IssueDate:     time.Date(<span class="hljs-number">2024</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, time.UTC),
        SellerID:      <span class="hljs-number">1</span>,
        BuyerID:       <span class="hljs-number">2</span>,
        DaysToPay:     <span class="hljs-number">14</span>,

        LineItems: []invoicehub.LineItem{
            {
                Description: <span class="hljs-string">"My awesome services"</span>,
                Amount: invoicehub.Amount{
                    Value:    decimal.NewFromFloat(<span class="hljs-number">100.00</span>),
                    Currency: <span class="hljs-string">"EUR"</span>,
                },
                VatRate: decimal.NewFromFloat(<span class="hljs-number">0.19</span>),
            },
        },
    }

    <span class="hljs-comment">// Create a new invoice</span>
    id, err := repo.Create(ctx, &amp;invoice)
    require.NoError(t, err)
    require.NotZero(t, id)
    require.Equal(t, id, invoice.ID)

    <span class="hljs-comment">// fetch the invoice</span>

    invoice2, err := repo.Get(ctx, id)
    require.NoError(t, err)
    compareInvoice(t, invoice, invoice2)

    <span class="hljs-comment">// fetch a non-existing invoice</span>
    _, err = repo.Get(ctx, <span class="hljs-number">999</span>)
    require.Error(t, err)

    <span class="hljs-comment">// add one more invoice</span>
    invoice3 := invoicehub.Invoice{
        InvoiceNumber: <span class="hljs-string">"1002/24"</span>,
        IssueDate:     time.Date(<span class="hljs-number">2024</span>, <span class="hljs-number">2</span>, <span class="hljs-number">2</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>, time.UTC),
        SellerID:      <span class="hljs-number">1</span>,
        BuyerID:       <span class="hljs-number">2</span>,
        DaysToPay:     <span class="hljs-number">14</span>,

        LineItems: []invoicehub.LineItem{
            {
                Description: <span class="hljs-string">"web development"</span>,
                Amount: invoicehub.Amount{
                    Value:    decimal.NewFromFloat(<span class="hljs-number">100.00</span>),
                    Currency: <span class="hljs-string">"EUR"</span>,
                },
                VatRate: decimal.NewFromFloat(<span class="hljs-number">0.19</span>),
            },
            {
                Description: <span class="hljs-string">"web scraping"</span>,
                Amount: invoicehub.Amount{
                    Value:    decimal.NewFromFloat(<span class="hljs-number">150.00</span>),
                    Currency: <span class="hljs-string">"EUR"</span>,
                },
                VatRate: decimal.NewFromFloat(<span class="hljs-number">0.19</span>),
            },
        },
    }

    id3, err := repo.Create(ctx, &amp;invoice3)
    require.NoError(t, err)
    require.NotZero(t, id3)
    require.Equal(t, id3, invoice3.ID)

    <span class="hljs-comment">// fetch the last invoice for the year 2024</span>

    lastInvoice, err := repo.GetLastInvoiceForYear(ctx, <span class="hljs-number">2024</span>)
    require.NoError(t, err)
    compareInvoice(t, invoice3, lastInvoice)
}
</code></pre>
<p>In the above test we:</p>
<ul>
<li><p>create an invoice</p>
</li>
<li><p>assert that our function returns a newly generated id</p>
</li>
<li><p>assert that the id generated is also set in our Invoice struct</p>
</li>
<li><p>try to fetch the invoice with that id</p>
</li>
<li><p>try to fetch a non-existing invoice and expect an error</p>
</li>
<li><p>create another invoice for a later date</p>
</li>
<li><p>assert that the last invoice of the year is the last created</p>
</li>
</ul>
<blockquote>
<p>We are using <code>decimal.Decimal</code>, and testify's equality assertions do not play well with it (I think because of the way they use reflection). Thus, we created a custom <code>compareInvoice</code> function that performs the assertions field by field.</p>
</blockquote>
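<p>To illustrate the general idea behind the note above, here is a minimal sketch that uses only the standard library. It uses <code>big.Float</code> as a stand-in, not <code>decimal.Decimal</code>, and the helper names (<code>sameValue</code>, <code>sameStructure</code>, <code>twoHundreds</code>) are made up for this example. It shows that two numbers can be equal in value while a reflection-based, field-by-field comparison still reports a difference, which is why comparing through a value-aware method or a string form can be more robust.</p>
<pre><code class="lang-go">package main

import (
	"fmt"
	"math/big"
	"reflect"
)

// sameValue reports whether a and b are numerically equal.
func sameValue(a, b *big.Float) bool {
	return a.Cmp(b) == 0
}

// sameStructure reports whether a and b are identical field by field,
// which is roughly what a reflection-based assertion checks.
func sameStructure(a, b *big.Float) bool {
	return reflect.DeepEqual(a, b)
}

// twoHundreds returns the number 100 stored with two different precisions.
func twoHundreds() (*big.Float, *big.Float) {
	a := new(big.Float).SetPrec(53).SetInt64(100)
	b := new(big.Float).SetPrec(200).SetInt64(100)
	return a, b
}

func main() {
	a, b := twoHundreds()
	fmt.Println("same value:", sameValue(a, b))
	fmt.Println("same structure:", sameStructure(a, b))
	fmt.Println("same string:", a.String() == b.String())
}
</code></pre>
<p>Whether this is exactly what happens with <code>decimal.Decimal</code> in our test is an assumption; the sketch only shows why comparing string representations, as <code>compareInvoice</code> does, sidesteps differences in internal representation.</p>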
<p>Now run the test again:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718691691450/e4f1fe1c-5d02-486f-aa88-6ce38cc2851e.png" alt class="image--center mx-auto" /></p>
<p>As expected, our test fails: we don't have an implementation yet.</p>
<p><strong>Separate our domain entity (invoicehub.Invoice) from the database models</strong></p>
<p>In <code>sqlite/invoice.go</code> let's define the struct that mirrors our database table:</p>
<pre><code class="lang-go"><span class="hljs-keyword">type</span> dbinvoice <span class="hljs-keyword">struct</span> {
    ID            <span class="hljs-keyword">int</span>       <span class="hljs-string">`gorm:"primaryKey"`</span>
    InvoiceNumber <span class="hljs-keyword">string</span>    <span class="hljs-string">`gorm:"type:text"`</span>
    IssueDate     time.Time <span class="hljs-string">`gorm:"type:datetime"`</span>
    BuyerID       <span class="hljs-keyword">int</span>
    SellerID      <span class="hljs-keyword">int</span>
    DaysToPay     <span class="hljs-keyword">int</span>

    LineItems datatypes.JSONSlice[invoicehub.LineItem]
}
</code></pre>
<p>Above we defined the database model; the table will be generated from it.</p>
<p>For GORM to take care of the table creation, modify <code>sqlite/sqlite.go</code>:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718692694835/50dcb67f-d235-45ac-a547-960e8273a0f2.png" alt class="image--center mx-auto" /></p>
<blockquote>
<p>An attentive reader will have already noticed that the <code>SellerID</code> and <code>BuyerID</code> fields do not have foreign keys that reference the dbcompanies table.</p>
</blockquote>
<p>We will leave these out for now. However, without them nothing stops an invoice from referencing a company that does not exist, which might lead to inconsistencies in our data at some point.</p>
<p><strong>Implement the Create method</strong></p>
<pre><code class="lang-go"><span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-params">(r *invoiceRepository)</span> <span class="hljs-title">Create</span><span class="hljs-params">(ctx context.Context, invoice *invoicehub.Invoice)</span> <span class="hljs-params">(<span class="hljs-keyword">int</span>, error)</span></span> {
    dbInvoice := dbinvoice{
        InvoiceNumber: invoice.InvoiceNumber,
        IssueDate:     invoice.IssueDate,
        BuyerID:       invoice.BuyerID,
        SellerID:      invoice.SellerID,
        DaysToPay:     invoice.DaysToPay,
        LineItems:     datatypes.NewJSONSlice(invoice.LineItems),
    }

    <span class="hljs-keyword">if</span> err := r.db.WithContext(ctx).Create(&amp;dbInvoice).Error; err != <span class="hljs-literal">nil</span> {
        <span class="hljs-keyword">return</span> <span class="hljs-number">0</span>, err
    }

    invoice.ID = dbInvoice.ID

    <span class="hljs-keyword">return</span> invoice.ID, <span class="hljs-literal">nil</span>
}
</code></pre>
<p><strong>Implement the Get method</strong></p>
<pre><code class="lang-go"><span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-params">(r *invoiceRepository)</span> <span class="hljs-title">Get</span><span class="hljs-params">(ctx context.Context, id <span class="hljs-keyword">int</span>)</span> <span class="hljs-params">(invoicehub.Invoice, error)</span></span> {
    <span class="hljs-keyword">var</span> dbitem dbinvoice
    <span class="hljs-keyword">if</span> err := r.db.WithContext(ctx).First(&amp;dbitem, id).Error; err != <span class="hljs-literal">nil</span> {
        <span class="hljs-keyword">return</span> invoicehub.Invoice{}, err
    }

    invoice := invoicehub.Invoice{
        ID:            dbitem.ID,
        InvoiceNumber: dbitem.InvoiceNumber,
        IssueDate:     dbitem.IssueDate,
        BuyerID:       dbitem.BuyerID,
        SellerID:      dbitem.SellerID,
        DaysToPay:     dbitem.DaysToPay,
        LineItems:     dbitem.LineItems,
    }

    <span class="hljs-keyword">return</span> invoice, <span class="hljs-literal">nil</span>
}
</code></pre>
<p><strong>Implement the GetLastInvoiceForYear method</strong></p>
<pre><code class="lang-go"><span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-params">(r *invoiceRepository)</span> <span class="hljs-title">GetLastInvoiceForYear</span><span class="hljs-params">(ctx context.Context, year <span class="hljs-keyword">int</span>)</span> <span class="hljs-params">(invoicehub.Invoice, error)</span></span> {
    query := r.db.WithContext(ctx).
        Where(<span class="hljs-string">"strftime('%Y', issue_date) = ?"</span>, strconv.Itoa(year)).
        Order(<span class="hljs-string">"issue_date desc"</span>).
        Limit(<span class="hljs-number">1</span>)

    <span class="hljs-keyword">var</span> dbitem dbinvoice
    <span class="hljs-keyword">if</span> err := query.First(&amp;dbitem).Error; err != <span class="hljs-literal">nil</span> {
        <span class="hljs-keyword">return</span> invoicehub.Invoice{}, err
    }

    invoice := invoicehub.Invoice{
        ID:            dbitem.ID,
        InvoiceNumber: dbitem.InvoiceNumber,
        IssueDate:     dbitem.IssueDate,
        BuyerID:       dbitem.BuyerID,
        SellerID:      dbitem.SellerID,
        DaysToPay:     dbitem.DaysToPay,
        LineItems:     dbitem.LineItems,
    }

    <span class="hljs-keyword">return</span> invoice, <span class="hljs-literal">nil</span>
}
</code></pre>
<p>This one is a little trickier.</p>
<p>We use SQLite's <code>strftime</code> function to keep only the invoices whose issue date falls in the provided year, then order them by issue date in descending order and take the first one.</p>
<blockquote>
<p>We don't have an index on the <code>IssueDate</code> column. This query might be slow when we have a lot of data.</p>
</blockquote>
<p>But since this web application will only serve one user, we will skip creating an index for now and add one when it becomes necessary. It is still good to keep in mind.</p>
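<p>One thing to keep in mind if an index is added later: wrapping a column in a function such as <code>strftime</code> generally prevents SQLite from using a plain index on that column. A possible alternative is a range condition like <code>issue_date &gt;= ? AND issue_date &lt; ?</code>. The sketch below only computes the two bounds; the helper name <code>yearBounds</code> is hypothetical, and it assumes issue dates are stored in UTC in a consistent, sortable format.</p>
<pre><code class="lang-go">package main

import (
	"fmt"
	"time"
)

// yearBounds returns the first instant of the given year and the first
// instant of the following year, both in UTC.
func yearBounds(year int) (time.Time, time.Time) {
	start := time.Date(year, time.January, 1, 0, 0, 0, 0, time.UTC)
	return start, start.AddDate(1, 0, 0)
}

func main() {
	start, end := yearBounds(2024)
	// These two values could be bound to the placeholders of a range
	// condition that keeps rows from start (inclusive) to end (exclusive).
	fmt.Println(start.Format(time.RFC3339), end.Format(time.RFC3339))
}
</code></pre>
<p>The half-open range avoids having to reason about the last instant of December 31st.</p>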
<p>Now we run our tests again:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718694275295/664e8b4b-9e99-49b3-9c9a-822d2f4881d0.png" alt class="image--center mx-auto" /></p>
<p>It looks like we are good, so let's commit:</p>
<pre><code class="lang-bash">git add .
git commit -m <span class="hljs-string">"implements invoice repository"</span>
</code></pre>
<p>You can find today's version in the GitHub branch:</p>
<p><a target="_blank" href="https://github.com/gosom/freelance-invoice-hub/tree/invoice-repo">GitHub invoice-repo branch</a></p>
<h3 id="heading-conclusion">Conclusion</h3>
<p>This blog post showed how to implement the repository pattern in Go using GORM and SQLite.</p>
<p>We have seen how we can write unit tests first for our database-related code and then complete the implementation.</p>
<p>We once again flagged the need to separate your database models from your domain entities.</p>
<p>❤️ If you have read this far and liked the article, please hit the like button.</p>
<p>👉 I would love for you to follow me on <a target="_blank" href="https://x.com/gkomdev">X</a> or <a target="_blank" href="http://www.linkedin.com/in/georgios-komninos-172508147">LinkedIn</a>.</p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text">What challenges have you faced while implementing the repository pattern in your projects?</div>
</div>]]></content:encoded></item><item><title><![CDATA[Tutorial:  Implementing Repository with GORM and SQLite]]></title><description><![CDATA[In the previous part of the series we created the required interface for our invoice generation/management web application.
In this post we are going to provide an implementation of the CompanyRepository.
In our requirements in the first part of th...]]></description><link>https://blog.gkomninos.com/tutorial-implementing-repository-with-gorm-and-sqlite</link><guid isPermaLink="true">https://blog.gkomninos.com/tutorial-implementing-repository-with-gorm-and-sqlite</guid><category><![CDATA[General Programming]]></category><category><![CDATA[Go Language]]></category><category><![CDATA[Web Development]]></category><category><![CDATA[Tutorial]]></category><dc:creator><![CDATA[Georgios Komninos]]></dc:creator><pubDate>Tue, 18 Jun 2024 05:00:47 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1718553790392/b59c8559-0598-457a-aba1-6ae9fa4dfbe9.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In the <a target="_blank" href="https://blog.gkomninos.com/tutorial-defining-the-domain-entities">previous</a> part of the <a target="_blank" href="https://blog.gkomninos.com/series/webapp-using-golang">series</a> we created the required interface for our invoice generation/management web application.</p>
<p>In this post we are going to provide an implementation of the <code>CompanyRepository</code>.</p>
<p>In our requirements in the <a target="_blank" href="https://blog.gkomninos.com/series/webapp-using-golang">first part</a> of the series we decided that we are going to use SQLite as our DBMS. Let's implement together the integration with SQLite.</p>
<h3 id="heading-implementing-the-companyrepository">Implementing the CompanyRepository</h3>
<p>We are going to use <a target="_blank" href="https://gorm.io/index.html">GORM</a> to interact with the database. However, we are going to ensure that our database models and code will be decoupled so we can switch ORM/driver or even database in the future.</p>
<p>We are going to work in the sqlite package.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718554572316/c0807d86-3dd7-4ff7-8dc6-9b103e33204b.png" alt class="image--center mx-auto" /></p>
<p>Let's first create a branch:</p>
<pre><code class="lang-bash">git checkout -b company-repo
</code></pre>
<p>and then install the dependencies:</p>
<pre><code class="lang-bash">go get gorm.io/gorm
go get gorm.io/driver/sqlite
go get gorm.io/datatypes
</code></pre>
<p>Then, in <code>sqlite/sqlite.go</code>, add:</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> sqlite

<span class="hljs-keyword">import</span> (
    <span class="hljs-string">"gorm.io/driver/sqlite"</span>
    <span class="hljs-string">"gorm.io/gorm"</span>
)

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">SetupDB</span><span class="hljs-params">(path <span class="hljs-keyword">string</span>)</span> <span class="hljs-params">(*gorm.DB, error)</span></span> {
    db, err := gorm.Open(sqlite.Open(path), &amp;gorm.Config{})
    <span class="hljs-keyword">if</span> err != <span class="hljs-literal">nil</span> {
        <span class="hljs-keyword">return</span> <span class="hljs-literal">nil</span>, err
    }

    err = db.AutoMigrate(
        &amp;dbcompany{},
    )
    <span class="hljs-keyword">if</span> err != <span class="hljs-literal">nil</span> {
        <span class="hljs-keyword">return</span> <span class="hljs-literal">nil</span>, err
    }

    <span class="hljs-keyword">return</span> db, <span class="hljs-literal">nil</span>
}
</code></pre>
<p>Explanation:</p>
<p>The <code>SetupDB</code> function opens a connection to the SQLite database and runs the migrations, meaning it creates the required tables if they do not exist or modifies them to match our schema.</p>
<p>Then create a file <code>sqlite/company.go</code>:</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> sqlite

<span class="hljs-keyword">import</span> (
    <span class="hljs-string">"context"</span>
    <span class="hljs-string">"invoicehub"</span>

    <span class="hljs-string">"gorm.io/datatypes"</span>
    <span class="hljs-string">"gorm.io/gorm"</span>
)

<span class="hljs-keyword">var</span> _ invoicehub.CompanyRepository = (*companyRepository)(<span class="hljs-literal">nil</span>)

<span class="hljs-keyword">type</span> dbcompany <span class="hljs-keyword">struct</span> {
    ID           <span class="hljs-keyword">int</span>    <span class="hljs-string">`gorm:"primaryKey"`</span>
    Name         <span class="hljs-keyword">string</span> <span class="hljs-string">`gorm:"type:text"`</span>
    Address      datatypes.JSONType[invoicehub.Address]
    Email        <span class="hljs-keyword">string</span> <span class="hljs-string">`gorm:"type:text"`</span>
    TaxID        <span class="hljs-keyword">string</span> <span class="hljs-string">`gorm:"type:text"`</span>
    VatID        <span class="hljs-keyword">string</span> <span class="hljs-string">`gorm:"type:text"`</span>
    BankAccounts datatypes.JSONSlice[invoicehub.BankAccount]
}

<span class="hljs-keyword">type</span> companyRepository <span class="hljs-keyword">struct</span> {
    db *gorm.DB
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">NewCompanyRepository</span><span class="hljs-params">(db *gorm.DB)</span> <span class="hljs-title">invoicehub</span>.<span class="hljs-title">CompanyRepository</span></span> {
    <span class="hljs-keyword">return</span> &amp;companyRepository{db: db}
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-params">(r *companyRepository)</span> <span class="hljs-title">Create</span><span class="hljs-params">(ctx context.Context, company *invoicehub.Company)</span> <span class="hljs-params">(<span class="hljs-keyword">int</span>, error)</span></span> {
    <span class="hljs-comment">// Convert invoicehub.Company to dbcompany</span>
    dbCompany := dbcompany{
        Name:         company.Name,
        Address:      datatypes.NewJSONType(company.Address),
        Email:        company.Email,
        TaxID:        company.TaxID,
        VatID:        company.VatID,
        BankAccounts: datatypes.NewJSONSlice(company.BankAccounts),
    }

    <span class="hljs-comment">// Save to database</span>

    <span class="hljs-keyword">if</span> err := r.db.WithContext(ctx).Create(&amp;dbCompany).Error; err != <span class="hljs-literal">nil</span> {
        <span class="hljs-keyword">return</span> <span class="hljs-number">0</span>, err
    }

    company.ID = dbCompany.ID

    <span class="hljs-keyword">return</span> company.ID, <span class="hljs-literal">nil</span>
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-params">(r *companyRepository)</span> <span class="hljs-title">Update</span><span class="hljs-params">(ctx context.Context, company *invoicehub.Company)</span> <span class="hljs-title">error</span></span> {
    <span class="hljs-keyword">var</span> existingCompany dbcompany
    <span class="hljs-keyword">if</span> err := r.db.WithContext(ctx).First(&amp;existingCompany, company.ID).Error; err != <span class="hljs-literal">nil</span> {
        <span class="hljs-keyword">return</span> err
    }

    existingCompany.Name = company.Name
    existingCompany.Address = datatypes.NewJSONType(company.Address)
    existingCompany.Email = company.Email
    existingCompany.TaxID = company.TaxID
    existingCompany.VatID = company.VatID
    existingCompany.BankAccounts = datatypes.NewJSONSlice(company.BankAccounts)

    <span class="hljs-keyword">if</span> err := r.db.WithContext(ctx).Save(&amp;existingCompany).Error; err != <span class="hljs-literal">nil</span> {
        <span class="hljs-keyword">return</span> err
    }

    <span class="hljs-keyword">return</span> <span class="hljs-literal">nil</span>
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-params">(r *companyRepository)</span> <span class="hljs-title">Get</span><span class="hljs-params">(ctx context.Context, id <span class="hljs-keyword">int</span>)</span> <span class="hljs-params">(invoicehub.Company, error)</span></span> {
    <span class="hljs-keyword">var</span> dbCompany dbcompany
    <span class="hljs-keyword">if</span> err := r.db.WithContext(ctx).First(&amp;dbCompany, id).Error; err != <span class="hljs-literal">nil</span> {
        <span class="hljs-keyword">return</span> invoicehub.Company{}, err
    }

    company := invoicehub.Company{
        ID:           dbCompany.ID,
        Name:         dbCompany.Name,
        Address:      dbCompany.Address.Data(),
        Email:        dbCompany.Email,
        TaxID:        dbCompany.TaxID,
        VatID:        dbCompany.VatID,
        BankAccounts: dbCompany.BankAccounts,
    }

    <span class="hljs-keyword">return</span> company, <span class="hljs-literal">nil</span>
}
</code></pre>
<p>❓ Why did I use a <code>dbcompany</code> struct that basically mirrors our <code>invoicehub.Company</code> struct?</p>
<p>We want our domain entity to be decoupled from the repository implementation. The reason is flexibility to change in the future: we don't want our domain entity to include any GORM-related code (see the struct tags).</p>
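<p>The separation can be made explicit with a pair of small conversion helpers. The following is a minimal sketch using trimmed-down stand-ins for both structs; the names <code>toDB</code> and <code>toDomain</code> are hypothetical and the JSON-backed fields are left out to keep the example free of external dependencies.</p>
<pre><code class="lang-go">package main

import "fmt"

// Company is a trimmed-down stand-in for the domain entity (no ORM tags).
type Company struct {
	ID    int
	Name  string
	Email string
}

// dbCompany is a trimmed-down stand-in for the persistence model; only
// this type carries storage-specific struct tags.
type dbCompany struct {
	ID    int    `gorm:"primaryKey"`
	Name  string `gorm:"type:text"`
	Email string `gorm:"type:text"`
}

// toDB converts a domain entity into its persistence model.
func toDB(c Company) dbCompany {
	return dbCompany{ID: c.ID, Name: c.Name, Email: c.Email}
}

// toDomain converts a persistence model back into the domain entity.
func toDomain(row dbCompany) Company {
	return Company{ID: row.ID, Name: row.Name, Email: row.Email}
}

func main() {
	original := Company{ID: 1, Name: "Example Ltd", Email: "info@example.com"}
	roundTripped := toDomain(toDB(original))
	fmt.Println(roundTripped == original)
}
</code></pre>
<p>With helpers like these, swapping the ORM would only touch the persistence model and the two conversion functions, while the domain entity stays untouched.</p>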
<h3 id="heading-writing-unit-tests">Writing unit tests</h3>
<p>We have to verify that our implementation works. Let's write a unit test for that.</p>
<p>For unit tests I like to use the <a target="_blank" href="https://github.com/stretchr/testify">testify</a> package:</p>
<pre><code class="lang-bash">go get github.com/stretchr/testify
</code></pre>
<p>Create a file <code>sqlite/company_test.go</code> .</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> sqlite_test

<span class="hljs-keyword">import</span> (
    <span class="hljs-string">"context"</span>
    <span class="hljs-string">"invoicehub"</span>
    <span class="hljs-string">"invoicehub/sqlite"</span>
    <span class="hljs-string">"testing"</span>

    <span class="hljs-string">"github.com/stretchr/testify/require"</span>
)

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">Test_companyRepository</span><span class="hljs-params">(t *testing.T)</span></span> {
    db, err := sqlite.SetupDB(<span class="hljs-string">":memory:"</span>)
    require.NoError(t, err)

    repo := sqlite.NewCompanyRepository(db)
    ctx := context.Background()

    company := invoicehub.Company{
        Name:  <span class="hljs-string">"Example Ltd"</span>,
        Email: <span class="hljs-string">"info@example.com"</span>,
        TaxID: <span class="hljs-string">"123456789"</span>,
        VatID: <span class="hljs-string">"987654321"</span>,
        Address: invoicehub.Address{
            Street:     <span class="hljs-string">"123 Example St"</span>,
            City:       <span class="hljs-string">"Limassol"</span>,
            PostalCode: <span class="hljs-string">"12345"</span>,
            Country:    <span class="hljs-string">"CY"</span>,
        },
        BankAccounts: []invoicehub.BankAccount{
            {
                BankName:      <span class="hljs-string">"Example Bank"</span>,
                AccountNumber: <span class="hljs-string">"1234567890"</span>,
                IBAN:          <span class="hljs-string">"EX12345678901234567890"</span>,
                BIC:           <span class="hljs-string">"EXBIC123"</span>,
            },
        },
    }

    <span class="hljs-comment">// Create a new company</span>
    id, err := repo.Create(ctx, &amp;company)
    require.NoError(t, err)
    require.NotZero(t, id)
    require.Equal(t, id, company.ID)

    <span class="hljs-comment">// Get the company</span>
    company2, err := repo.Get(ctx, id)
    require.NoError(t, err)
    require.Equal(t, company, company2)

    <span class="hljs-comment">// Update the company</span>
    company.Name = <span class="hljs-string">"Example Ltd 2"</span>
    err = repo.Update(ctx, &amp;company)
    require.NoError(t, err)

    updatedCompany, err := repo.Get(ctx, id)
    require.NoError(t, err)
    require.Equal(t, company, updatedCompany)

    <span class="hljs-comment">// not found</span>
    _, err = repo.Get(ctx, <span class="hljs-number">999</span>)
    require.Error(t, err)
}
</code></pre>
<p>💥 We cover just the basic test cases, so we have some confidence that it works.</p>
<p>Let's run the tests:</p>
<pre><code class="lang-bash">go <span class="hljs-built_in">test</span> -v ./...
</code></pre>
<p>and you should get something like:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718557926092/177442a8-6009-4f09-856e-0b6de6114315.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-pass-the-dependency-in-our-maingo">Pass the dependency in our main.go</h3>
<p>Open <code>cmd/main.go</code> and make the following changes:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718558262745/caac1712-5fe5-41d6-bcd8-0ed4922d881c.png" alt class="image--center mx-auto" /></p>
<p>Note that I set a default path that works with our Docker setup.</p>
<p>Also add the <code>data.sqlite3</code> file to <code>.gitignore</code>:</p>
<pre><code class="lang-bash"><span class="hljs-built_in">echo</span> <span class="hljs-string">"data.sqlite3"</span> &gt;&gt; .gitignore
</code></pre>
<p>Now let's test our docker setup.</p>
<pre><code class="lang-bash">make dev
</code></pre>
<p>Notice the created file:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718558412661/56b1f567-3600-4780-919b-c123e4758126.png" alt class="image--center mx-auto" /></p>
<p>Let's open the file with a SQLite client. On Linux run:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718558551372/07da25e5-c344-465d-8c39-08fff45f3670.png" alt class="image--center mx-auto" /></p>
<p>Let's now commit our code so far:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718558629539/bfeaec18-1415-4df1-9c32-287f11585599.png" alt class="image--center mx-auto" /></p>
<p>As usual, you can find the code so far in the related <a target="_blank" href="https://github.com/gosom/freelance-invoice-hub/tree/company-repo">GitHub branch</a>.</p>
<h3 id="heading-deployment">Deployment</h3>
<p>We need to set up a new environment variable in our production <code>.env</code> file.</p>
<p>Log in to your VPS and add the line:</p>
<pre><code class="lang-bash">FH_DB_PATH=invoices.sqlite3
</code></pre>
<p>Now you can go back to your machine and run</p>
<pre><code class="lang-bash">make deploy
</code></pre>
<p>After deployment, check that the <code>invoices.sqlite3</code> file was created.</p>
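<p>For reference, one possible way to resolve the database path from <code>FH_DB_PATH</code> with a fallback is sketched below. The helper <code>dbPath</code> and the fallback value <code>data.sqlite3</code> are assumptions made for illustration, not necessarily what <code>cmd/main.go</code> does.</p>
<pre><code class="lang-go">package main

import (
	"fmt"
	"os"
)

// defaultDBPath is an assumed fallback used for illustration only.
const defaultDBPath = "data.sqlite3"

// dbPath returns the value of FH_DB_PATH when it is set and non-empty,
// otherwise the fallback. The lookup function is injected so the logic
// can be exercised without touching the real environment.
func dbPath(lookup func(string) (string, bool)) string {
	if v, ok := lookup("FH_DB_PATH"); ok {
		if v != "" {
			return v
		}
	}
	return defaultDBPath
}

func main() {
	// os.LookupEnv has exactly the signature dbPath expects.
	fmt.Println(dbPath(os.LookupEnv))
}
</code></pre>
<p>Running it with <code>FH_DB_PATH=invoices.sqlite3</code> set prints that path; without the variable it prints the fallback.</p>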
<h3 id="heading-conclusion">Conclusion</h3>
<p>In this tutorial, we successfully implemented the <code>CompanyRepository</code> using GORM and SQLite, ensuring our code remains flexible and decoupled from the domain entity. We also wrote unit tests to verify our implementation and made necessary adjustments to our <code>main.go</code> to integrate the repository. Finally, we configured our production environment and verified the deployment.</p>
<p>In the next article we are going to implement the <code>InvoiceRepository</code>.</p>
<p>❤️ Please follow me on <a target="_blank" href="https://x.com/gkomdev">X</a> or <a target="_blank" href="http://www.linkedin.com/in/georgios-komninos-172508147">LinkedIn</a></p>
<p>If you have any questions, please reach out to me via a comment or a DM on X.</p>
<p>Find the rest of the tutorials in this series <a target="_blank" href="https://blog.gkomninos.com/series/webapp-using-golang">here</a></p>
]]></content:encoded></item><item><title><![CDATA[Tutorial: Defining the Domain entities]]></title><description><![CDATA[This blog is the part of the series Building a Web App with Golang.
In today's article we are going to define the basic entities and the operations on them for our web application. In the first part we defined the scope of the application.
We are goi...]]></description><link>https://blog.gkomninos.com/tutorial-defining-the-domain-entities</link><guid isPermaLink="true">https://blog.gkomninos.com/tutorial-defining-the-domain-entities</guid><category><![CDATA[Go Language]]></category><category><![CDATA[#Domain-Driven-Design]]></category><category><![CDATA[Tutorial]]></category><category><![CDATA[Web Development]]></category><dc:creator><![CDATA[Georgios Komninos]]></dc:creator><pubDate>Mon, 17 Jun 2024 05:00:27 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1718522427679/d009303d-842d-424e-b7cd-03dd6c48bde4.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>This post is part of the series <a target="_blank" href="https://blog.gkomninos.com/series/webapp-using-golang"><strong>Building a Web App with Golang</strong></a><strong>.</strong></p>
<p>In today's article we are going to define the basic entities and the operations on them for our web application. In the <a target="_blank" href="https://blog.gkomninos.com/crafting-a-web-application-with-golang-a-step-by-step-guide">first part</a> we defined the scope of the application.</p>
<p>We are going to build a system that can issue invoices for clients.</p>
<p>Let's create a new git branch to work on:</p>
<pre><code class="lang-bash">git checkout -b basic-domain
</code></pre>
<p><strong>Company</strong></p>
<p>Every client and "us" (our company/the freelancer) will be represented by an entity <code>Company</code> .</p>
<p>Let's create a file named <code>company.go</code> in the root of our project</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718515845175/f16a0a8b-cf20-4369-bc1b-c87d5f2fea4d.png" alt class="image--center mx-auto" /></p>
<p>Then we define a struct with the basic attributes:</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> invoicehub

<span class="hljs-keyword">type</span> Company <span class="hljs-keyword">struct</span> {
    ID           <span class="hljs-keyword">int</span>
    Name         <span class="hljs-keyword">string</span>
    Address      Address
    Email        <span class="hljs-keyword">string</span>
    TaxID        <span class="hljs-keyword">string</span>
    VatID        <span class="hljs-keyword">string</span>
    BankAccounts []BankAccount
}

<span class="hljs-keyword">type</span> Address <span class="hljs-keyword">struct</span> {
    Street     <span class="hljs-keyword">string</span>
    City       <span class="hljs-keyword">string</span>
    PostalCode <span class="hljs-keyword">string</span>
    Country    <span class="hljs-keyword">string</span>
}

<span class="hljs-keyword">type</span> BankAccount <span class="hljs-keyword">struct</span> {
    BankName      <span class="hljs-keyword">string</span>
    AccountNumber <span class="hljs-keyword">string</span>
    IBAN          <span class="hljs-keyword">string</span>
    BIC           <span class="hljs-keyword">string</span>
}
</code></pre>
<p>The idea is that we will represent each real life company using the above struct.</p>
<p><strong>Invoice</strong></p>
<p>Create an <code>invoice.go</code> in the root of the project</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718516394251/a8486aac-cfe5-4ebf-aab2-5e0003c36db0.png" alt class="image--center mx-auto" /></p>
<p>Each invoice will have at least the following:<br />- Seller<br />- Buyer<br />- Amount<br />- Issue Date<br />- Invoice Number<br />- Days (the number of days after issuance within which the invoice must be paid).</p>
<p>First we need to install one dependency to handle decimal numbers:</p>
<pre><code class="lang-bash">go get github.com/shopspring/decimal
</code></pre>
<p>Now let's start with that:</p>
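<pre><code class="lang-go"><span class="hljs-keyword">package</span> invoicehub

<span class="hljs-keyword">import</span> (
    <span class="hljs-string">"time"</span>

    <span class="hljs-string">"github.com/shopspring/decimal"</span>
)

<span class="hljs-keyword">type</span> Invoice <span class="hljs-keyword">struct</span> {
    ID            <span class="hljs-keyword">int</span>
    InvoiceNumber <span class="hljs-keyword">string</span>
    IssueDate     time.Time
    SellerID      <span class="hljs-keyword">int</span>
    BuyerID       <span class="hljs-keyword">int</span>
    DaysToPay     <span class="hljs-keyword">int</span>

    LineItems []LineItem
}

<span class="hljs-keyword">type</span> LineItem <span class="hljs-keyword">struct</span> {
    Description <span class="hljs-keyword">string</span>
    Amount      Amount
    VatRate     decimal.Decimal
}

<span class="hljs-keyword">type</span> Amount <span class="hljs-keyword">struct</span> {
    Value    decimal.Decimal
    Currency <span class="hljs-keyword">string</span>
}
</code></pre>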
<p>The above structs should satisfy our needs for now.</p>
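<p>To make the <code>DaysToPay</code> field concrete, here is a minimal sketch of how a due date could be derived from it. The <code>invoiceDates</code> type and the <code>DueDate</code> method are hypothetical names used only for this illustration; they are not part of the project code.</p>
<pre><code class="lang-go">package main

import (
	"fmt"
	"time"
)

// invoiceDates is a hypothetical, trimmed-down stand-in for Invoice that
// keeps only the two fields needed to compute a due date.
type invoiceDates struct {
	IssueDate time.Time
	DaysToPay int
}

// DueDate returns the last day on which the invoice can be paid on time.
func (i invoiceDates) DueDate() time.Time {
	return i.IssueDate.AddDate(0, 0, i.DaysToPay)
}

func main() {
	inv := invoiceDates{
		IssueDate: time.Date(2024, time.June, 17, 0, 0, 0, 0, time.UTC),
		DaysToPay: 30,
	}
	fmt.Println(inv.DueDate().Format("2006-01-02"))
}
</code></pre>
<p>Keeping the number of days on the invoice, instead of a precomputed date, means the due date always follows from the issue date.</p>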
<p>Note that I am using IDs for Buyer and Seller. We could have used a <code>Company</code> struct there.</p>
<p><mark>Why did I choose to use IDs instead of a Company struct?</mark></p>
<p>Our application is currently a "monolithic" application. In the future we might want to split it into different microservices (maybe one microservice to handle the companies and one to handle the invoices). Let's be proactive and try to decouple the companies from the invoices.</p>
<p>It does not look like we need other structs to represent our business domain. Of course, we might have forgotten something, but then we will change it.</p>
<p><strong>Operations</strong></p>
<p>Our application is pretty simple:</p>
<p>For a Company we need to be able to:</p>
<p><strong>C</strong>reate a company<br /><strong>R</strong>ead a company<br /><strong>U</strong>pdate a company</p>
<p>Our requirements don't include a Delete operation, so we skip it for now.</p>
<p>Creating, reading, and updating a company are straightforward.</p>
<p>First we need a way to do these operations in our storage layer.<br />For this we are going to create an interface <code>CompanyRepository</code></p>
<p>Let's do that:</p>
<p>In your <code>company.go</code> add the following (you also need to import the <code>context</code> package):</p>
<pre><code class="lang-go"><span class="hljs-keyword">type</span> CompanyRepository <span class="hljs-keyword">interface</span> {
    Create(ctx context.Context, company *Company) (<span class="hljs-keyword">int</span>, error)
    Update(ctx context.Context, company *Company) error
    Get(ctx context.Context, id <span class="hljs-keyword">int</span>) (Company, error)
}
</code></pre>
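<p>Since no concrete storage exists yet, an in-memory implementation is a quick way to check that the interface is usable. The following is only a sketch: <code>memCompanies</code>, <code>newMemCompanies</code> and <code>errCompanyNotFound</code> are hypothetical names, and <code>Company</code> is trimmed to two fields to keep it short.</p>
<pre><code class="lang-go">package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

// Company is trimmed to two fields to keep the sketch short.
type Company struct {
	ID   int
	Name string
}

// CompanyRepository mirrors the interface defined in company.go.
type CompanyRepository interface {
	Create(ctx context.Context, company *Company) (int, error)
	Update(ctx context.Context, company *Company) error
	Get(ctx context.Context, id int) (Company, error)
}

// errCompanyNotFound is a hypothetical sentinel error for unknown IDs.
var errCompanyNotFound = errors.New("company not found")

// memCompanies is a hypothetical in-memory implementation, handy for tests.
type memCompanies struct {
	mu     sync.Mutex
	nextID int
	items  map[int]Company
}

func newMemCompanies() *memCompanies {
	repo := new(memCompanies)
	repo.items = make(map[int]Company)
	return repo
}

// Create assigns the next free ID and stores a copy of the company.
func (r *memCompanies) Create(_ context.Context, company *Company) (int, error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.nextID++
	company.ID = r.nextID
	r.items[company.ID] = *company
	return company.ID, nil
}

// Update overwrites an existing company and fails for unknown IDs.
func (r *memCompanies) Update(_ context.Context, company *Company) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, ok := r.items[company.ID]; !ok {
		return errCompanyNotFound
	}
	r.items[company.ID] = *company
	return nil
}

// Get returns a copy of the stored company.
func (r *memCompanies) Get(_ context.Context, id int) (Company, error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	company, ok := r.items[id]
	if !ok {
		return Company{}, errCompanyNotFound
	}
	return company, nil
}

func main() {
	var repo CompanyRepository = newMemCompanies()
	ctx := context.Background()

	acme := new(Company)
	acme.Name = "ACME Ltd"
	id, _ := repo.Create(ctx, acme)

	stored, err := repo.Get(ctx, id)
	fmt.Println(stored.Name, err)
}
</code></pre>
<p>A real implementation would talk to a database, but the callers would not need to change because they only depend on the interface.</p>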
<p>Now for our invoices:</p>
<p>We need to be able to:</p>
<p><strong>C</strong>reate Invoice<br /><strong>R</strong>ead Invoice<br />Create a PDF from an invoice</p>
<p>We don't have full CRUD (Create, Read, Update, Delete) since we don't want to update or delete invoices.</p>
<p>Let's define the repository also for invoices:</p>
<p>In <code>invoice.go</code> add the following (again importing the <code>context</code> package):</p>
<pre><code class="lang-go"><span class="hljs-keyword">type</span> InvoiceRepository <span class="hljs-keyword">interface</span> {
    Create(ctx context.Context, invoice *Invoice) (<span class="hljs-keyword">int</span>, error)
    Get(ctx context.Context, id <span class="hljs-keyword">int</span>) (Invoice, error)
}
</code></pre>
<p>What about PDF generation here? This is not a database operation.<br />Additionally, what about creation of an invoice? It is a little bit more complex.</p>
<p>In the <a target="_blank" href="https://blog.gkomninos.com/crafting-a-web-application-with-golang-a-step-by-step-guide">requirements</a> we had that the <code>InvoiceNumber</code> looks like:</p>
<blockquote>
<p>1031/24</p>
</blockquote>
<p>so we have two parts:<br />1. a number that resets every year (1031)<br />2. the year (24)</p>
<p>So the next invoice should look like 1032/24, but if it is the first one for 2025 it should look like 1001/25.</p>
<p>Notice that the numbering always starts from 1000.</p>
<p>This is just one way; others might do it differently.</p>
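<p>The numbering rule can be sketched as a small pure function. This is only an illustration under assumptions: the name <code>nextInvoiceNumber</code> is hypothetical, the previous invoice number is passed in as a string (empty when there is none), and the first number of a year is taken to be 1001, as in the example above.</p>
<pre><code class="lang-go">package main

import (
	"fmt"
	"strconv"
	"strings"
)

// firstSequence is the number given to the first invoice of a year.
// Assumption: the example above uses 1001 for the first invoice.
const firstSequence = 1001

// nextInvoiceNumber computes the number that follows last for the given
// four-digit year. Pass an empty last when no invoice exists yet.
func nextInvoiceNumber(last string, year int) (string, error) {
	yy := year % 100
	if last == "" {
		return fmt.Sprintf("%d/%02d", firstSequence, yy), nil
	}
	seqPart, yearPart, ok := strings.Cut(last, "/")
	if !ok {
		return "", fmt.Errorf("malformed invoice number %q", last)
	}
	seq, err := strconv.Atoi(seqPart)
	if err != nil {
		return "", fmt.Errorf("invalid sequence in %q: %w", last, err)
	}
	lastYY, err := strconv.Atoi(yearPart)
	if err != nil {
		return "", fmt.Errorf("invalid year in %q: %w", last, err)
	}
	// A different year means the sequence starts over.
	if lastYY != yy {
		return fmt.Sprintf("%d/%02d", firstSequence, yy), nil
	}
	return fmt.Sprintf("%d/%02d", seq+1, yy), nil
}

func main() {
	for _, year := range []int{2024, 2025} {
		next, err := nextInvoiceNumber("1031/24", year)
		fmt.Println(year, next, err)
	}
}
</code></pre>
<p>Because the function takes the previous number and the year as arguments, it can be unit tested without a database or a clock.</p>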
<p>It would be nice to have a generic way to compute that per customer. <mark>BUT, do not forget our requirements. We want to code for now and not for imaginary use cases. Let's get stuff done</mark></p>
<p>Since invoices are a bit more complex let's also create a Service Layer that will handle the business logic .</p>
<p>Open <code>invoice.go</code> and add:</p>
<pre><code class="lang-go"><span class="hljs-keyword">type</span> InvoiceService <span class="hljs-keyword">interface</span> {
    Create(ctx context.Context, invoice *Invoice) error
    Get(ctx context.Context, id <span class="hljs-keyword">int</span>) (Invoice, error)
    CreatePDF(ctx context.Context, id <span class="hljs-keyword">int</span>) ([]<span class="hljs-keyword">byte</span>, error)
}
</code></pre>
<p>We also need an extra method in the repository that returns the last invoice of the year. We are going to use it to compute the invoice number.</p>
<p>Add the <code>GetLastInvoiceForYear</code> method to the repository:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718519776641/4605eaa9-cbc4-4eaa-97ce-effefd15fc5d.png" alt class="image--center mx-auto" /></p>
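<p>For readers who cannot see the screenshot, here is a sketch of what the extended repository could look like. The signature of <code>GetLastInvoiceForYear</code> used here is an assumption, as are the <code>memInvoices</code> type and the <code>errInvoiceNotFound</code> error; <code>Invoice</code> is trimmed to the fields the sketch needs.</p>
<pre><code class="lang-go">package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// Invoice is trimmed to the fields this sketch needs.
type Invoice struct {
	ID            int
	InvoiceNumber string
	IssueDate     time.Time
}

// InvoiceRepository with the extra method. The signature of
// GetLastInvoiceForYear is an assumption made for this sketch.
type InvoiceRepository interface {
	Create(ctx context.Context, invoice *Invoice) (int, error)
	Get(ctx context.Context, id int) (Invoice, error)
	GetLastInvoiceForYear(ctx context.Context, year int) (Invoice, error)
}

// errInvoiceNotFound is a hypothetical sentinel error.
var errInvoiceNotFound = errors.New("invoice not found")

// memInvoices is a hypothetical in-memory implementation (not safe for
// concurrent use) that keeps invoices in creation order.
type memInvoices struct {
	items []Invoice
}

func (r *memInvoices) Create(_ context.Context, invoice *Invoice) (int, error) {
	invoice.ID = len(r.items) + 1
	r.items = append(r.items, *invoice)
	return invoice.ID, nil
}

func (r *memInvoices) Get(_ context.Context, id int) (Invoice, error) {
	for _, inv := range r.items {
		if inv.ID == id {
			return inv, nil
		}
	}
	return Invoice{}, errInvoiceNotFound
}

// GetLastInvoiceForYear returns the most recently created invoice whose
// issue date falls in the given year.
func (r *memInvoices) GetLastInvoiceForYear(_ context.Context, year int) (Invoice, error) {
	var last Invoice
	found := false
	for _, inv := range r.items {
		if inv.IssueDate.Year() == year {
			last = inv
			found = true
		}
	}
	if !found {
		return Invoice{}, errInvoiceNotFound
	}
	return last, nil
}

func main() {
	var repo InvoiceRepository = new(memInvoices)
	ctx := context.Background()

	for _, number := range []string{"1030/24", "1031/24"} {
		inv := new(Invoice)
		inv.InvoiceNumber = number
		inv.IssueDate = time.Date(2024, time.June, 1, 0, 0, 0, 0, time.UTC)
		repo.Create(ctx, inv)
	}

	last, err := repo.GetLastInvoiceForYear(ctx, 2024)
	fmt.Println(last.InvoiceNumber, err)
}
</code></pre>
<p>Returning a sentinel error when a year has no invoices yet lets the service layer decide to start a new sequence.</p>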
<h3 id="heading-adding-the-interfaces-in-our-http-layer">Adding the interfaces in our HTTP layer</h3>
<p>Now that we have "described" the operations of our system via the interfaces let's see where we are going to use them.</p>
<p>In <code>http/router.go</code> we defined a base handler:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718520492810/d072a638-c954-4c73-b6f9-d8947e24bed1.png" alt class="image--center mx-auto" /></p>
<p>Let's add the interfaces there:</p>
<pre><code class="lang-go"><span class="hljs-keyword">type</span> baseHandler <span class="hljs-keyword">struct</span> {
    e         *echo.Echo
    companies invoicehub.CompanyRepository
    invoices  invoicehub.InvoiceService
}
</code></pre>
<p>So when we create the <code>baseHandler</code> we should inject the implementations of the interfaces:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718520651156/afaba281-bee9-4f02-aa70-e71168895a04.png" alt class="image--center mx-auto" /></p>
<p>Since we don't have any concrete implementation yet, let's pass <code>nil</code> where we call the <code>NewRouter</code> function in our <code>cmd/main.go</code>.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718520733482/2d4aa839-121c-41f4-81a3-6f7b41785e64.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-commit-and-push">Commit and Push</h3>
<p>Let's commit our code and push our branch:</p>
<pre><code class="lang-bash">git add .
git commit -m <span class="hljs-string">"add basic domain entities"</span>
git push origin basic-domain
</code></pre>
<p>And as usual, you can find today's code in the GitHub <a target="_blank" href="https://github.com/gosom/freelance-invoice-hub/tree/basic-domain">branch</a>.</p>
<h3 id="heading-summary-and-what-is-next">Summary and what is next</h3>
<p>In today's blog post we started implementing our business logic. We first defined the required structs and the operations that we can do on these.<br />The operations are defined using interfaces. Utilizing interfaces allows us to think about our business logic without having to think about implementation details. Additionally, it will allow us to easily unit test our application later.</p>
<p>I would like to share a piece of advice here. <mark>When you have to develop an application, no matter how complex it is, first try to focus on what your goal is</mark>. Don't fall into the trap of constantly questioning yourself like:</p>
<p>what if the user also wants that?<br />what if this changes?<br />what if ... ?</p>
<p>These "what ifs" may never happen. If they do, be confident that you will handle them when it's required.</p>
<p>I am not saying or implying that you should not think about scalability, changing requirements and so on. I am saying: think about them and try to code in a way that defers some decisions, but don't waste your time trying to code something smart when it is not required.</p>
<p>At the moment I am writing these lines I am thinking about the following:</p>
<p>- How am I going to distinguish which company is the Seller when we don't have a User entity?<br />- What if someone else creates an invoice in parallel while we create one (so the invoice number may come out wrong)?<br />- How am I going to create an invoice number near the end of the year when the system uses UTC but my current timezone is UTC+3 and I am already in the next year?<br />- ...</p>
<p>and many, many more.</p>
<p>Some of these may be valid concerns. But so far we don't have anything working yet. <mark>We don't have to aim for perfection; working software is better than no software in the end.</mark></p>
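<p>The time zone concern from the list above is easy to demonstrate, even if we decide not to solve it now. The sketch below uses a hypothetical helper, <code>invoiceYear</code>, and a fixed UTC+3 zone to show that the same instant falls in different years depending on the location used.</p>
<pre><code class="lang-go">package main

import (
	"fmt"
	"time"
)

// invoiceYear returns the calendar year of issuedAt as seen in loc.
// It is a hypothetical helper, not part of the project code.
func invoiceYear(issuedAt time.Time, loc *time.Location) int {
	return issuedAt.In(loc).Year()
}

func main() {
	// 22:30 UTC on New Year's Eve is already 01:30 on January 1st in UTC+3.
	issuedAt := time.Date(2024, time.December, 31, 22, 30, 0, 0, time.UTC)
	local := time.FixedZone("UTC+3", 3*60*60)

	fmt.Println(invoiceYear(issuedAt, time.UTC), invoiceYear(issuedAt, local))
}
</code></pre>
<p>Whichever location we pick later, the point is to pick one explicitly and use it everywhere the invoice year is computed.</p>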
<p><strong>Links to the other parts of this series</strong></p>
<p><a target="_blank" href="https://blog.gkomninos.com/crafting-a-web-application-with-golang-a-step-by-step-guide">https://blog.gkomninos.com/crafting-a-web-application-with-golang-a-step-by-step-guide</a></p>
<p><a target="_blank" href="https://blog.gkomninos.com/setting-up-a-docker-development-enviroment-for-go">https://blog.gkomninos.com/setting-up-a-docker-development-enviroment-for-go</a></p>
<p><a target="_blank" href="https://blog.gkomninos.com/building-a-robust-web-server-in-go-a-step-by-step-guide">https://blog.gkomninos.com/building-a-robust-web-server-in-go-a-step-by-step-guide</a></p>
<p><a target="_blank" href="https://blog.gkomninos.com/tutorial-deployment-of-golang-web-app-using-systemd">https://blog.gkomninos.com/tutorial-deployment-of-golang-web-app-using-systemd</a></p>
<p>👍 Please leave a comment or like the article if you like it. This will help me gain visibility.</p>
<p>❤️ Following me on GitHub or <a target="_blank" href="https://x.com/gkomdev">X</a> or <a target="_blank" href="http://www.linkedin.com/in/georgios-komninos-172508147">LinkedIn</a> motivates me to keep writing.</p>
]]></content:encoded></item><item><title><![CDATA[Tutorial: Deployment of Golang web app using Systemd]]></title><description><![CDATA[Today, I am going to show you a simple way of deploying a Golang web application.
We are going to use Systemd and a Makefile to deploy the code when we merge to main branch. In a future blog post we will revisit and show you how you can deploywhen yo...]]></description><link>https://blog.gkomninos.com/tutorial-deployment-of-golang-web-app-using-systemd</link><guid isPermaLink="true">https://blog.gkomninos.com/tutorial-deployment-of-golang-web-app-using-systemd</guid><category><![CDATA[Go Language]]></category><category><![CDATA[Tutorial]]></category><category><![CDATA[Golang web development]]></category><dc:creator><![CDATA[Georgios Komninos]]></dc:creator><pubDate>Sun, 16 Jun 2024 05:00:20 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1718468171692/0b74707f-8c64-4882-b9a1-81bc17b20d57.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Today, I am going to show you a simple way of deploying a Golang web application.</p>
<p>We are going to use <a target="_blank" href="https://systemd.io/">Systemd</a> and a Makefile to deploy the code when we merge to main branch. In a future blog post we will revisit and show you how you can deploy<br />when you push to main by utilizing Github Pipelines.</p>
<p>Let's get started 🥳</p>
<h3 id="heading-setting-up-systemd">Setting up Systemd</h3>
<p>I assume that you already have a Linux VPS. If not, then I recommend you create an account on Digital Ocean using this <a target="_blank" href="https://m.do.co/c/c11136c4693c">link</a>. If you register using this link you will get $200 in credit and I will get $25 if you continue using it. Basically, you can try it for free, so give it a try.</p>
<p>The instructions will be for an Ubuntu VPS so if you create a new one now please use the Ubuntu image.</p>
<h3 id="heading-setting-up-dns">Setting up DNS</h3>
<p>Configure your domain to point to your server's IP address.</p>
<p>Then <code>ssh</code> into the VPS and run:</p>
<p><strong>In your VPS</strong></p>
<pre><code class="lang-bash">sudo touch /etc/systemd/system/invoice-hub.service
mkdir -p /home/giorgos/invoice-server
touch /home/giorgos/invoice-server/.env
sudo mkdir -p /.cache/.certs &amp;&amp; sudo chown -R giorgos:giorgos /.cache/.certs
</code></pre>
<p><strong><mark>☢️ adjust the paths and the USER</mark></strong></p>
<p>Then open <code>/etc/systemd/system/invoice-hub.service</code> with an editor and paste:</p>
<pre><code class="lang-bash">[Unit]
Description=Freelance invoice hub service

[Install]
WantedBy=multi-user.target

[Service]
Type=simple
ExecStart=/home/giorgos/invoice-server/invoice-server
WorkingDirectory=/home/giorgos/invoice-server
EnvironmentFile=/home/giorgos/invoice-server/.env
Restart=always
RestartSec=5
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=%n
</code></pre>
<p><mark>⚠️ Modify the paths to match your server's.</mark></p>
<p>❕This is not the "best" systemd config. However, we want to deploy fast and this is good enough. We will revisit it in the future.</p>
<p>Then, while still on your VPS, edit the <code>.env</code> file you created and add:</p>
<pre><code class="lang-bash">FIH_DOMAIN=&lt;YOUR-DOMAIN&gt;
</code></pre>
<p><mark>Obviously, replace with your domain.</mark></p>
<p>Now let's reload systemd and enable the service:</p>
<pre><code class="lang-bash">sudo systemctl daemon-reload
sudo systemctl <span class="hljs-built_in">enable</span> invoice-hub
</code></pre>
<p><strong>In your local computer</strong></p>
<p>Now in your local computer build the program:</p>
<pre><code class="lang-bash">GOOS=linux GOARCH=amd64 go build -o invoice-server cmd/main.go
</code></pre>
<p>This will create a file named <code>invoice-server</code>.<br />If your VPS uses a different OS or architecture, please use the correct environment variables. See here for a <a target="_blank" href="https://www.digitalocean.com/community/tutorials/how-to-build-go-executables-for-multiple-platforms-on-ubuntu-16-04">tutorial</a>.</p>
<p>Now let's manually copy the file to the server</p>
<pre><code class="lang-bash">scp invoice-server &lt;USERNAME&gt;@&lt;server-ip&gt;:~/invoice-server/invoice-server
</code></pre>
<p>The above command just copies the executable you built to the server's path.</p>
<p>If you have not already done so, I highly recommend you create a public/private key pair and use that to log in to your server.</p>
<p>This tutorial is not about that, so if you don't know how to do it, google it or ask ChatGPT.</p>
<p>I like to add an entry in my ~/.ssh/config file like:</p>
<pre><code class="lang-bash">Host &lt;just-a-name&gt;
HostName &lt;server-ip&gt;
User giorgos
IdentityFile ~/.ssh/id_rsa
Port 22
</code></pre>
<p>I highly recommend doing that.</p>
<p>Then you can ssh using <code>ssh just-a-name</code>.</p>
<p>Enough with that. I assume you have managed to copy your file to the server by now.</p>
<p>Log in to your VPS again</p>
<p>and run:</p>
<pre><code class="lang-bash">sudo systemctl start invoice-hub
</code></pre>
<p>Now check the status:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718465830052/08b5b5df-87e1-441f-929b-0151587e69aa.png" alt class="image--center mx-auto" /></p>
<p>If you did everything correctly, you can now visit your new web app at</p>
<p>https://YOUR-DOMAIN.com</p>
<p>and you should see the following:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718465893901/e721c848-9d74-4a11-a4f2-a896cbc0937a.png" alt class="image--center mx-auto" /></p>
<p>🚀🚀🚀🚀🚀</p>
<p>Because it's a little bit tedious to manually build and copy the executable and then stop and start the service, let's automate it a bit.</p>
<p>We are going to utilize our Makefile.</p>
<p>Let's create a new branch:</p>
<pre><code class="lang-bash">git checkout -b deploy-Makefile-auth
</code></pre>
<p>and add the following in your Makefile</p>
<pre><code class="lang-bash">deploy: <span class="hljs-comment">## deploys to the remote server</span>
    GOOS=linux GOARCH=amd64 go build -o invoice-server cmd/main.go
    scp invoice-server myserver:~/invoice-server/invoice-server.2
    ssh myserver sudo systemctl stop invoice-hub
    ssh myserver sudo mv invoice-server/invoice-server.2 invoice-server/invoice-server
    ssh myserver sudo systemctl start invoice-hub
</code></pre>
<p><strong><mark>Please replace myserver with your server Host (as you configured in your ~/.ssh/config) OR use username@server-ip instead</mark></strong></p>
<p>Let's make a change and deploy.</p>
<p>Instead of <code>HELLO WORLD</code> let's make our handler return</p>
<p><code>THANK YOU</code></p>
<p>Change your http/router.go as in the image</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718466476304/97d38fa1-20b7-4816-9bad-d2ed5419b699.png" alt class="image--center mx-auto" /></p>
<p>and then run</p>
<pre><code class="lang-bash">make deploy
</code></pre>
<p>Wait until the deployment finishes and go to your website:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718466562453/dc5545dc-1242-4ea1-9612-20ad2d7fa280.png" alt class="image--center mx-auto" /></p>
<p>You should see the above .</p>
<h3 id="heading-ia">Basic HTTP Auth</h3>
<p>We don't want our web app to be accessible to everyone. Let's configure Basic HTTP Auth.</p>
<p>In our case this is fine since the app will be used only by one user and we are using TLS.</p>
<p>Echo has a middleware that does that for us.</p>
<p>Let's do the following:</p>
<pre><code class="lang-bash">go get <span class="hljs-string">"github.com/labstack/echo/v4/middleware"</span>
</code></pre>
<p>and then in our <code>router.go</code> modify the new method to work like:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718467162797/9027a4d2-eb62-4583-b21b-7f36493f8382.png" alt class="image--center mx-auto" /></p>
<p>As you see in the image we need to store our username and password in the environment variables.</p>
<p>For local development add them in the <code>dev.env</code></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718467222752/58ead675-94c9-47a6-b211-9b471a07b83b.png" alt class="image--center mx-auto" /></p>
<p>For production, set them in the <code>invoice-server/.env</code> file on your server.</p>
<p><mark>USE SOMETHING SECURE</mark></p>
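<p>Since the validation code is only visible in the screenshot, here is a hedged sketch of how the credential check itself could be written using only the standard library. The function name <code>credentialsMatch</code> and the environment variable names <code>FIH_AUTH_USER</code> and <code>FIH_AUTH_PASSWORD</code> are assumptions for illustration; use whatever names your code reads.</p>
<pre><code class="lang-go">package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"fmt"
	"os"
)

// credentialsMatch reports whether the supplied credentials equal the
// expected ones. Hashing first gives both inputs the same length, and
// ConstantTimeCompare avoids leaking how many leading bytes matched.
func credentialsMatch(gotUser, gotPass, wantUser, wantPass string) bool {
	// Refuse to authenticate anyone when nothing is configured.
	if wantUser == "" || wantPass == "" {
		return false
	}
	gotUserSum := sha256.Sum256([]byte(gotUser))
	wantUserSum := sha256.Sum256([]byte(wantUser))
	gotPassSum := sha256.Sum256([]byte(gotPass))
	wantPassSum := sha256.Sum256([]byte(wantPass))

	userOK := subtle.ConstantTimeCompare(gotUserSum[:], wantUserSum[:])
	passOK := subtle.ConstantTimeCompare(gotPassSum[:], wantPassSum[:])
	// Both results are 1 on a match, so the product is 1 only if both match.
	return userOK*passOK == 1
}

func main() {
	// FIH_AUTH_USER and FIH_AUTH_PASSWORD are hypothetical variable names.
	wantUser := os.Getenv("FIH_AUTH_USER")
	wantPass := os.Getenv("FIH_AUTH_PASSWORD")

	fmt.Println(credentialsMatch("admin", "guess", wantUser, wantPass))
}
</code></pre>
<p>A function like this could be called from the validator passed to Echo's Basic Auth middleware. Rejecting empty expected values protects you if the environment variables are missing on the server.</p>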
<p>Let's now test it</p>
<pre><code class="lang-bash">make dev
</code></pre>
<p>and then when you visit:</p>
<p><a target="_blank" href="https://local.freelance-invoice-hub.com">https://local.freelance-invoice-hub.com</a></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718467565594/71ed06a3-186c-490d-8567-3f8aee42ec0a.png" alt class="image--center mx-auto" /></p>
<p>since it works let's now commit and deploy:</p>
<pre><code class="lang-bash">git add .
git commit -m <span class="hljs-string">"simple deploy script and basic auth"</span>
git push origin deploy-Makefile-auth
</code></pre>
<p>And let's deploy</p>
<pre><code class="lang-bash">make deploy
</code></pre>
<p>As usual, you can find all the code in the related <a target="_blank" href="https://github.com/gosom/freelance-invoice-hub/tree/deploy-Makefile-auth">GitHub branch</a>.</p>
<h3 id="heading-summary-and-whats-next">Summary and what's next</h3>
<p>Today we learned a simple way to deploy our Golang applications using Systemd.</p>
<p>This is a really basic method and we can improve our deployments a LOT.</p>
<p><mark>However, the scope of this series is to create a WORKING golang application and at the same time learn that our time is precious. We will do only what is required.<br />When we have an application working we will keep improving.</mark></p>
<p>Additionally, we learned how to use the Basic Auth middleware that Echo provides.</p>
<p>In the next blog we will start coding our application. We will define our domain models and the operations on them.</p>
<p>Until then have fun.</p>
<p>Please comment if something is not clear or does not work for you and I will try to help</p>
<p>Also, don't forget to follow me on <a target="_blank" href="https://x.com/gkomdev">X</a></p>
]]></content:encoded></item><item><title><![CDATA[Building a Robust Web Server in Go: A Step-by-Step Guide]]></title><description><![CDATA[We continue our journey implementing a real world web application in Golang in this blog post.
In the previous post we setup our docker development environment that just prints hello world.
In this post we are going to setup a real webserver that it'...]]></description><link>https://blog.gkomninos.com/building-a-robust-web-server-in-go-a-step-by-step-guide</link><guid isPermaLink="true">https://blog.gkomninos.com/building-a-robust-web-server-in-go-a-step-by-step-guide</guid><category><![CDATA[coding]]></category><category><![CDATA[golang]]></category><category><![CDATA[webdev]]></category><category><![CDATA[Tutorial]]></category><category><![CDATA[Go Language]]></category><dc:creator><![CDATA[Georgios Komninos]]></dc:creator><pubDate>Sat, 15 Jun 2024 05:00:44 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1718299262472/ae3c3dca-e0d5-4389-bd68-6134212753cf.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>We continue our journey implementing a real world web application in Golang in this blog post.</p>
<p>In the previous post we set up our Docker development environment, which just prints hello world.</p>
<p>In this post we are going to set up a real web server that is going to serve our web application.</p>
<p>We are going to learn how to:</p>
<ul>
<li><p>setup the golang http.Server</p>
</li>
<li><p>automatically obtain a TLS certificate using Let's Encrypt</p>
</li>
<li><p>create an HTTP handler using <a target="_blank" href="https://echo.labstack.com/">Echo Framework</a></p>
</li>
</ul>
<h3 id="heading-webserver">Webserver</h3>
<p>Golang offers a very capable HTTP server in the <a target="_blank" href="https://pkg.go.dev/net/http#Server">net/http</a> package. We are going to utilize that.</p>
<p>Let's first create a new git branch:</p>
<pre><code class="lang-bash">git checkout -b basic-webserver
</code></pre>
<p>Let's see our current project layout:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718292277311/3340fffd-dd18-4b58-876f-94c2d07a1709.png" alt class="image--center mx-auto" /></p>
<p>We will put the code of our webserver in the http package. Let's not complicate things and create extra packages or abstractions for now.</p>
<p>In <code>http/http.go</code>:</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> http

<span class="hljs-keyword">import</span> (
    <span class="hljs-string">"context"</span>
    <span class="hljs-string">"crypto/tls"</span>
    <span class="hljs-string">"net/http"</span>
    <span class="hljs-string">"time"</span>
)

<span class="hljs-keyword">type</span> ServerParams <span class="hljs-keyword">struct</span> {
    Handler http.Handler
}

<span class="hljs-keyword">type</span> Server <span class="hljs-keyword">struct</span> {
    websrv *http.Server
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">New</span><span class="hljs-params">(params *ServerParams)</span> *<span class="hljs-title">Server</span></span> {
    ans := Server{
        websrv: &amp;http.Server{
            Addr:              <span class="hljs-string">":443"</span>,
            Handler:           params.Handler,
            ReadTimeout:       <span class="hljs-number">5</span> * time.Second,
            WriteTimeout:      <span class="hljs-number">10</span> * time.Second,
            IdleTimeout:       <span class="hljs-number">5</span> * time.Second,
            ReadHeaderTimeout: <span class="hljs-number">5</span> * time.Second,
            MaxHeaderBytes:    <span class="hljs-number">1</span> &lt;&lt; <span class="hljs-number">20</span>,
        },
    }

    <span class="hljs-keyword">return</span> &amp;ans
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-params">(s *Server)</span> <span class="hljs-title">Start</span><span class="hljs-params">(ctx context.Context)</span> <span class="hljs-title">error</span></span> {
    <span class="hljs-keyword">return</span> s.websrv.ListenAndServeTLS(<span class="hljs-string">""</span>, <span class="hljs-string">""</span>)
}
</code></pre>
<p>As you may have noticed, all the settings are hardcoded for the moment. However, the function receives a params struct as an argument, so we can make them configurable later.</p>
<p>An interesting part is that our server listens on port 443 (TLS) but we haven't configured any SSL certificate.</p>
<p>Let's see how it behaves.</p>
<p>In <code>cmd/main.go</code>:</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> main

<span class="hljs-keyword">import</span> (
    <span class="hljs-string">"context"</span>

    <span class="hljs-string">"invoicehub/http"</span>
)

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span> {
    ctx, cancel := context.WithCancel(context.Background())
    <span class="hljs-keyword">defer</span> cancel()

    params := http.ServerParams{}

    srv := http.New(&amp;params)

    err := srv.Start(ctx)
    <span class="hljs-keyword">if</span> err != <span class="hljs-literal">nil</span> {
        <span class="hljs-built_in">panic</span>(err)
    }
}
</code></pre>
<p>and then:</p>
<p>we need a valid SSL certificate.</p>
<p>We have actually two things to do:</p>
<ol>
<li><p>Make our server automatically obtain the SSL certificate when we deploy to a server</p>
</li>
<li><p>Provide a valid certificate when we work locally</p>
</li>
</ol>
<p><strong>Valid local certificate</strong></p>
<p>Create a <code>certs</code> folder and add it to <code>.gitignore</code>:</p>
<pre><code class="lang-bash">mkdir certs
<span class="hljs-built_in">echo</span> <span class="hljs-string">"certs/*"</span> &gt;&gt; .gitignore
</code></pre>
<p>Install the awesome <a target="_blank" href="https://github.com/FiloSottile/mkcert">mkcert tool</a> for your OS.</p>
<p>Then add the following to your <code>Makefile</code>:</p>
<pre><code class="lang-bash">create-certs: <span class="hljs-comment">## generate self-signed certificates</span>
    @mkcert -install
    @mkcert -cert-file ./certs/local-cert.crt \
       -key-file ./certs/local-cert.key \
       local.freelance-invoice-hub.com localhost 127.0.0.1 ::1
</code></pre>
<p>And run <code>make create-certs</code>:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718296360925/bf505980-5e74-4b62-a819-907f3b6a2655.png" alt class="image--center mx-auto" /></p>
<p>The above created two files:<br /><code>./certs/local-cert.crt</code></p>
<p><code>./certs/local-cert.key</code></p>
<p>We need to modify our code to use these certs when we work locally.</p>
<p>We read an environment variable called <code>FIH_DOMAIN</code>; if it is empty, we use the certificates we just created.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718297538775/11e37cd7-aeb7-41f4-bbd7-88a2b29c3859.png" alt class="image--center mx-auto" /></p>
<p>(also import the <code>os</code> package in your imports)</p>
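<p>As the change itself is only shown in the screenshot, here is a minimal sketch of the decision it describes. The helper name <code>tlsFilesFor</code> and the constants are hypothetical. It relies on the fact that <code>ListenAndServeTLS</code> accepts empty file names when the server's <code>TLSConfig</code> already provides certificates.</p>
<pre><code class="lang-go">package main

import (
	"fmt"
	"os"
)

// Paths of the files generated by mkcert in the previous step.
const (
	localCertFile = "./certs/local-cert.crt"
	localKeyFile  = "./certs/local-cert.key"
)

// tlsFilesFor returns the certificate and key files to hand to
// ListenAndServeTLS. With no domain configured we are working locally and
// use the mkcert files; otherwise both are empty and the certificate is
// expected to come from the server's TLSConfig.
func tlsFilesFor(domain string) (certFile, keyFile string) {
	if domain == "" {
		return localCertFile, localKeyFile
	}
	return "", ""
}

func main() {
	certFile, keyFile := tlsFilesFor(os.Getenv("FIH_DOMAIN"))
	fmt.Printf("cert=%q key=%q\n", certFile, keyFile)
}
</code></pre>
<p>In <code>Start</code> the two returned values would be passed on to <code>ListenAndServeTLS</code>.</p>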
<p>Let's try that:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718297615201/5d019a94-658a-4f47-abcd-e89d4b97e0ef.png" alt class="image--center mx-auto" /></p>
<p>So for localhost it works, let's also configure the domain<br />local.freelance-invoice-hub.com.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718297681794/59616b91-1d11-4863-a553-106bee2d823d.png" alt class="image--center mx-auto" /></p>
<p>For this to work we need to add an entry to the hosts file.</p>
<p>On Linux, add an entry to <code>/etc/hosts</code> like:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718297836326/8f36d076-1221-4f01-9e2f-21871d3d2f41.png" alt class="image--center mx-auto" /></p>
<p>For other operating systems do the equivalent as described <a target="_blank" href="https://www.howtogeek.com/27350/beginner-geek-how-to-edit-your-hosts-file/">here</a>,</p>
<p>and try again:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718297888613/e78b80b3-db74-4215-abaf-07b8cc35e2c4.png" alt class="image--center mx-auto" /></p>
<p>Of course the HTTP status code we get is perfectly valid since we haven't registered anything on our server yet.</p>
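<p>Why is an error status expected here? Our <code>main.go</code> passes an empty <code>ServerParams</code>, so the handler is <code>nil</code> and <code>net/http</code> falls back to its default mux, which has no routes. The sketch below reproduces that situation with a fresh, empty <code>ServeMux</code>; the helper name <code>statusWithoutRoutes</code> is hypothetical.</p>
<pre><code class="lang-go">package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// statusWithoutRoutes sends a GET request for path to a mux that has no
// handlers registered and returns the resulting HTTP status code.
func statusWithoutRoutes(path string) int {
	mux := http.NewServeMux() // nothing registered on purpose
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, path, nil)
	mux.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusWithoutRoutes("/"))
}
</code></pre>
<p>The mux answers every path with 404 Not Found until a handler is registered, so the TLS setup is working even though there is nothing to serve yet.</p>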
<p><strong>Obtain a TLS certificate automatically using Auto TLS</strong></p>
<p>To automatically obtain a valid TLS certificate using Golang we are going to install two packages.</p>
<p>In your root directory run:</p>
<pre><code class="lang-bash">go get <span class="hljs-string">"golang.org/x/crypto/acme"</span> <span class="hljs-string">"golang.org/x/crypto/acme/autocert"</span>
</code></pre>
<p>This will install the two packages. They are maintained by the Go project but live in a separate repository (<code>golang.org/x/crypto</code>) rather than in the standard library, which is why their import paths contain <code>/x/</code>.</p>
<p>Now add a method to the <code>Server</code> struct:</p>
<pre><code class="lang-go"><span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-params">(s *Server)</span> <span class="hljs-title">setupAutoTLS</span><span class="hljs-params">(domains []<span class="hljs-keyword">string</span>)</span></span> {
    <span class="hljs-keyword">const</span> defaultCertCache = <span class="hljs-string">"/.cache/.certs"</span>
    autoTLSManager := autocert.Manager{
        Prompt:     autocert.AcceptTOS,
        Cache:      autocert.DirCache(defaultCertCache),
        HostPolicy: autocert.HostWhitelist(domains...),
    }
    <span class="hljs-comment">// https://ssl-config.mozilla.org/#server=go&amp;version=1.22.0&amp;config=intermediate&amp;guideline=5.7</span>
    s.websrv.TLSConfig = &amp;tls.Config{
        MinVersion:               tls.VersionTLS12,
        PreferServerCipherSuites: <span class="hljs-literal">true</span>,
        CipherSuites: []<span class="hljs-keyword">uint16</span>{
            tls.TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,
            tls.TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,
            tls.TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,
            tls.TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,
            tls.TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,
            tls.TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,
        },
        CurvePreferences: []tls.CurveID{
            tls.CurveP256,
            tls.X25519,
        },
        GetCertificate: autoTLSManager.GetCertificate,
        NextProtos: []<span class="hljs-keyword">string</span>{
            <span class="hljs-string">"h2"</span>, <span class="hljs-string">"http/1.1"</span>, <span class="hljs-comment">// enable HTTP/2</span>
            acme.ALPNProto, <span class="hljs-comment">// enable tls-alpn ACME challenges</span>
        },
    }
}
</code></pre>
<p>And don't forget to add the imports at the top:</p>
<pre><code class="lang-go">    <span class="hljs-string">"golang.org/x/crypto/acme"</span>
    <span class="hljs-string">"golang.org/x/crypto/acme/autocert"</span>
</code></pre>
<p>Important note:</p>
<p>When I created the TLS configuration I had to pick cipher suites and elliptic curves. I used the ones shown in the code because they are the ones recommended by</p>
<p>👉 <a target="_blank" href="https://ssl-config.mozilla.org/#server=go&amp;version=1.22.0&amp;config=intermediate&amp;guideline=5.7">Mozilla</a>.</p>
<p>To test this we need a domain whose DNS records point to our hosting server. In the next blog post we are going to deploy what we have so far to a real web server. For now let's continue.</p>
<p><strong>What is missing</strong></p>
<p>Our server does not yet shut down gracefully when it receives a SIGTERM.<br />We will leave it for now. It's good enough, but we should revisit it.</p>
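<p>For when we do revisit it, here is a minimal sketch of a graceful shutdown. It assumes a plain HTTP listener instead of the TLS setup above, and <code>newSignalContext</code>, <code>newServer</code>, <code>runServer</code> and the 10 second grace period are hypothetical names and values, not code from the repository.</p>

```go
package main

import (
	"context"
	"errors"
	"log"
	"net/http"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// shutdownGrace is an assumed limit on how long in-flight requests may take to finish.
const shutdownGrace = 10 * time.Second

// newSignalContext returns a context that is cancelled on SIGTERM or Ctrl+C.
func newSignalContext() (context.Context, context.CancelFunc) {
	return signal.NotifyContext(context.Background(), syscall.SIGTERM, os.Interrupt)
}

// newServer builds a plain HTTP server (hypothetical stand-in for the Server in this series).
func newServer(addr string) *http.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("HELLO WORLD"))
	})
	return &http.Server{Addr: addr, Handler: mux}
}

// runServer serves until ctx is cancelled, then lets in-flight requests finish.
func runServer(ctx context.Context, srv *http.Server) error {
	errCh := make(chan error, 1)
	go func() { errCh <- srv.ListenAndServe() }()

	select {
	case err := <-errCh:
		// The server could not start or stopped on its own.
		return err
	case <-ctx.Done():
		// A signal arrived: fall through to the shutdown below.
	}

	shutdownCtx, cancel := context.WithTimeout(context.Background(), shutdownGrace)
	defer cancel()
	if err := srv.Shutdown(shutdownCtx); err != nil {
		return err
	}
	// After Shutdown, ListenAndServe returns http.ErrServerClosed, which is not a failure.
	if err := <-errCh; !errors.Is(err, http.ErrServerClosed) {
		return err
	}
	return nil
}

func main() {
	ctx, stop := newSignalContext()
	defer stop()

	if err := runServer(ctx, newServer("127.0.0.1:8080")); err != nil {
		log.Printf("server error: %v", err)
		return
	}
	log.Println("server stopped cleanly")
}
```

<p>The idea is that <code>Shutdown</code> stops accepting new connections and waits for running requests, so a deploy or a container stop does not cut responses in half.</p>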
<h3 id="heading-lets-finally-make-our-server-do-something-useful">Let's finally make our server do something useful</h3>
<p>We will finally add an HTTP handler that returns something super useful.</p>
<p>What's better than a <code>HELLO WORLD</code> that we can see in our browser 🚀</p>
<p>First install the <a target="_blank" href="https://echo.labstack.com/">echo</a> framework.</p>
<p>Wait a second, why Echo? Echo makes our life a little bit easier without being huge, and it mostly stays out of our way. Another very good choice is <a target="_blank" href="https://go-chi.io/#/">chi</a>. Honestly, there are so many frameworks and benchmarks out there that I don't have a definitive answer, so let's just pick Echo.</p>
<p>In any case, the way we are going to code our application we should be able to switch frameworks without having to rewrite everything (but let's avoid that).</p>
<p>So, install echo:</p>
<pre><code class="lang-bash">go get github.com/labstack/<span class="hljs-built_in">echo</span>/v4
</code></pre>
<p>and then create a file <code>http/router.go</code></p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> http

<span class="hljs-keyword">import</span> (
    <span class="hljs-string">"net/http"</span>

    <span class="hljs-string">"github.com/labstack/echo/v4"</span>
)

<span class="hljs-keyword">type</span> Router <span class="hljs-keyword">struct</span> {
    e *echo.Echo
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">NewRouter</span><span class="hljs-params">()</span> *<span class="hljs-title">Router</span></span> {
    ans := Router{
        e: echo.New(),
    }

    ans.e.Debug = <span class="hljs-literal">true</span>

    baseHandler := baseHandler{
        e: ans.e,
    }

    handlers := []handler{
        &amp;baseHandler,
    }

    <span class="hljs-keyword">for</span> _, h := <span class="hljs-keyword">range</span> handlers {
        h.RegisterRoutes()
    }

    <span class="hljs-keyword">return</span> &amp;ans
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-params">(r *Router)</span> <span class="hljs-title">Handler</span><span class="hljs-params">()</span> <span class="hljs-title">http</span>.<span class="hljs-title">Handler</span></span> {
    <span class="hljs-keyword">return</span> r.e
}

<span class="hljs-keyword">type</span> handler <span class="hljs-keyword">interface</span> {
    RegisterRoutes()
}

<span class="hljs-keyword">type</span> baseHandler <span class="hljs-keyword">struct</span> {
    e *echo.Echo
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-params">(b *baseHandler)</span> <span class="hljs-title">RegisterRoutes</span><span class="hljs-params">()</span></span> {
    b.e.GET(<span class="hljs-string">"/"</span>, <span class="hljs-function"><span class="hljs-keyword">func</span><span class="hljs-params">(c echo.Context)</span> <span class="hljs-title">error</span></span> {
        <span class="hljs-keyword">return</span> c.String(http.StatusOK, <span class="hljs-string">"HELLO WORLD"</span>)
    })
}
</code></pre>
<p>Here we just created an <code>echo.Echo</code> instance and attached a function that runs<br />when we visit the path <code>/</code> on the server.</p>
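<p>Because <code>Router.Handler()</code> returns a plain <code>http.Handler</code>, a route like this can be checked without opening a port. The sketch below uses only the standard library, so <code>helloHandler</code> is a stand-in for <code>NewRouter().Handler()</code> and <code>get</code> is a hypothetical helper, not code from the repository.</p>

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// helloHandler is a stand-in for NewRouter().Handler() from the article.
func helloHandler() http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
		fmt.Fprint(w, "HELLO WORLD")
	})
	return mux
}

// get sends an in-memory GET request to h and returns the status code and body.
func get(h http.Handler, path string) (int, string) {
	req := httptest.NewRequest(http.MethodGet, path, nil)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	status, body := get(helloHandler(), "/")
	fmt.Println(status, body)
}
```

<p>In the real project you would pass <code>NewRouter().Handler()</code> to the same helper from a test file.</p>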
<p>If you have followed the tutorial so far, run:</p>
<pre><code class="lang-bash">make dev
</code></pre>
<p>and visit:</p>
<p><a target="_blank" href="https://local.freelance-invoice-hub.com">https://local.freelance-invoice-hub.com</a></p>
<p>or</p>
<p><a target="_blank" href="https://localhost">https://localhost</a></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718301990947/15be1b7b-c71a-4f05-979d-f29d993e6b45.png" alt class="image--center mx-auto" /></p>
<p>🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥🔥</p>
<h3 id="heading-commit">Commit</h3>
<p>As usual, we commit our work. You can find it in<br />the GitHub <a target="_blank" href="https://github.com/gosom/freelance-invoice-hub/tree/basic-webserver">branch</a>.</p>
<h3 id="heading-summary-and-whats-next">Summary and what's next</h3>
<p>In today's post we learned how to create a web server in Go and how to attach a router (Echo in our case). We also learned how to obtain TLS certificates automatically from Let's Encrypt (I forgot to mention that autocert also takes care of certificate renewal automatically). Finally, we created valid TLS certificates for local development.</p>
<p>So far in the series we are just building our foundation, a kind of framework of our own that eases and speeds up development. Most of the code we wrote is the same for many web apps, so we could extract some libraries out of it or a template to use in other projects.</p>
<p>In the next blog post we are going to add Basic authentication and show you how to deploy the application on a cheap VPS. We will also configure GitHub CI/CD to run when we merge to the main branch.</p>
<p>That's all for today.</p>
<p>❓ If you have any questions or something is not running on your machine, reach out via a comment and I will try to help.</p>
<p>❤️ Please subscribe to my newsletter and follow me on <a target="_blank" href="https://x.com/gkomdev">X</a> or <a target="_blank" href="https://www.linkedin.com/in/georgios-komninos-172508147/">LinkedIn</a>.</p>
]]></content:encoded></item><item><title><![CDATA[Setting up a Docker development enviroment for Go]]></title><description><![CDATA[Welcome back to our Golang web application series! If you're excited to learn web development with Golang, you're in the right place. In this guide, we'll set up a solid development environment. By the end, you'll have a well-organized project and be...]]></description><link>https://blog.gkomninos.com/setting-up-a-docker-development-enviroment-for-go</link><guid isPermaLink="true">https://blog.gkomninos.com/setting-up-a-docker-development-enviroment-for-go</guid><category><![CDATA[coding]]></category><category><![CDATA[Tutorial]]></category><category><![CDATA[golang]]></category><category><![CDATA[Web Development]]></category><category><![CDATA[Go Language]]></category><dc:creator><![CDATA[Georgios Komninos]]></dc:creator><pubDate>Fri, 14 Jun 2024 05:00:08 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1718256461382/938e82bd-4424-46fa-aaac-2054b042ce4f.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Welcome back to our <a target="_blank" href="https://blog.gkomninos.com/series/webapp-using-golang">Golang web application</a> series! If you're excited to learn web development with Golang, you're in the right place. In this guide, we'll set up a solid development environment. By the end, you'll have a well-organized project and be able to run your code locally with Docker Compose, complete with hot reloading for easy development. Let's get started!  </p>
<h3 id="heading-prerequisites">Prerequisites</h3>
<p>Ensure that you have the following installed:<br />- <a target="_blank" href="https://docs.docker.com/get-docker/">Docker</a><br />- <a target="_blank" href="https://docs.docker.com/compose/">Docker Compose</a><br />- <a target="_blank" href="https://go.dev/">Go</a></p>
<h3 id="heading-basic-setup">Basic setup</h3>
<p>Create a folder that will host your project and initialize a git repository:</p>
<pre><code class="lang-bash">mkdir freelance-invoice-hub
<span class="hljs-built_in">cd</span> freelance-invoice-hub
git init
</code></pre>
<p>Let's create a new git branch</p>
<pre><code class="lang-bash">git checkout -b project-skeleton
</code></pre>
<p>Now we need to create a go module:</p>
<pre><code class="lang-bash">go mod init invoicehub
</code></pre>
<p>Let's also create a .gitignore, a README, a LICENCE and a Makefile.<br />Let's keep them empty for now:</p>
<pre><code class="lang-bash">touch README.md
touch .gitignore
touch LICENCE
touch Makefile
</code></pre>
<h3 id="heading-golang-project-structure">Golang Project structure</h3>
<p>We need to organize our project skeleton in a way that we can easily extend and test our code.<br />Below you can see a good starting point for your go projects.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718258441751/39a0194e-3520-4b0f-beb9-3771aeff1465.png" alt class="image--center mx-auto" /></p>
<p>Let's explain the project layout and what we will put in each folder</p>
<p><strong>cmd:</strong> Here we will have our executable, the main function.<br />Right now it just prints hello world :)</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718258541093/c219bd5c-3618-4f6c-8df4-474c294a0598.png" alt class="image--center mx-auto" /></p>
<p><strong>http:</strong> In this package we are going to add all the code that is related to the HTTP server, the handlers etc. For now just create an empty file http.go</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718258601783/9934c1c6-812a-42db-b79a-16075060a824.png" alt class="image--center mx-auto" /></p>
<p><strong>mocks:</strong> This folder will hold the mock implementations of our services. The mocks will be auto-generated. More on that later. For now just add a mocks.go which only declares that this is a Go package.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718258697624/a88db52c-5804-469e-8636-c2b298066c10.png" alt class="image--center mx-auto" /></p>
<p><strong>sqlite</strong>: This folder will contain the implementation of our database layer. Since we are going to use <a target="_blank" href="https://sqlite.org/">sqlite</a>, we name that package sqlite. As with mocks, just create a file that only declares the package name.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718258776676/6b479278-a984-4cb8-967f-6e0474950a98.png" alt class="image--center mx-auto" /></p>
<p>In the root folder we are going to create .go files that represent our domain entities and their operations. For now let's leave it empty.</p>
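<p>Since the screenshot of <code>cmd/main.go</code> cannot be copied, here is a minimal sketch that should behave the same way. The <code>greeting</code> helper is my assumption and is only there to keep the text in one easily testable place; the file in the screenshot may simply call <code>fmt.Println</code> directly.</p>

```go
package main

import "fmt"

// greeting returns the text that the placeholder program prints.
func greeting() string {
	return "hello world"
}

func main() {
	fmt.Println(greeting())
}
```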
<p>Let's test that our main runs now:</p>
<pre><code class="lang-bash">go run cmd/main.go
</code></pre>
<p>This should print <em>hello world</em>.</p>
<p>If not, please check the error message and your Go installation.<br />In case you run into problems, please reach out to me either on <a target="_blank" href="https://x.com/gkomdev">Twitter</a> or in a comment and I will assist.</p>
<h3 id="heading-setting-up-dockerfile-and-docker-compose">Setting up Dockerfile and docker-compose</h3>
<p>We want to be able to run our application in a docker container. Additionally, the docker container should reload when the code changes (hot reload).</p>
<p>Let's do that 🚀</p>
<pre><code class="lang-bash">touch dev.Dockerfile
</code></pre>
<p>and paste the following  </p>
<pre><code class="lang-dockerfile"><span class="hljs-keyword">FROM</span> golang:<span class="hljs-number">1.22</span>.<span class="hljs-number">1</span>-bullseye

<span class="hljs-keyword">RUN</span><span class="bash"> apt-get update \
    &amp;&amp; apt-get install -y ca-certificates curl gnupg \
    &amp;&amp; mkdir -p /etc/apt/keyrings</span>

<span class="hljs-keyword">WORKDIR</span><span class="bash"> /app</span>

<span class="hljs-keyword">RUN</span><span class="bash"> go install go.uber.org/mock/mockgen@latest &amp;&amp; \
    go install github.com/air-verse/air@latest</span>


<span class="hljs-keyword">RUN</span><span class="bash"> git config --global --add safe.directory /app</span>

<span class="hljs-keyword">CMD</span><span class="bash"> [<span class="hljs-string">"air"</span>, <span class="hljs-string">"-c"</span>, <span class="hljs-string">".air.toml"</span>]</span>
</code></pre>
<p>Notes:</p>
<p>We install two extra Go tools in the Docker container:  </p>
<p>mockgen: It will be used to generate mock implementations automatically from our interfaces.</p>
<p>air: This is a program that monitors your files for changes and restarts the process. This is the tool that we are going to use for "hot" reloading the code.  </p>
<p>Air requires a <code>.air.toml</code> file with its configuration.<br />Create that file using <code>touch .air.toml</code> and paste the following:</p>
<pre><code class="lang-yaml"><span class="hljs-comment"># Config file for [Air](https://github.com/cosmtrek/air) in TOML format</span>

<span class="hljs-comment"># Working directory</span>
<span class="hljs-comment"># . or absolute path, please note that the directories following must be under root.</span>
<span class="hljs-string">root</span> <span class="hljs-string">=</span> <span class="hljs-string">"."</span>
<span class="hljs-string">tmp_dir</span> <span class="hljs-string">=</span> <span class="hljs-string">"tmp"</span>

[<span class="hljs-string">build</span>]
<span class="hljs-comment"># Array of commands to run before each build</span>
<span class="hljs-string">pre_cmd</span> <span class="hljs-string">=</span> [<span class="hljs-string">"go mod download"</span>]
<span class="hljs-comment"># Just plain old shell command. You could use `make` as well.</span>
<span class="hljs-string">cmd</span> <span class="hljs-string">=</span> <span class="hljs-string">"go build -o ./tmp/freelance-invoice-hub cmd/main.go"</span>
<span class="hljs-comment"># Array of commands to run after ^C</span>
<span class="hljs-comment">#post_cmd = ["echo 'hello air' &gt; post_cmd.txt"]</span>
<span class="hljs-comment"># Binary file yields from `cmd`.</span>
<span class="hljs-string">bin</span> <span class="hljs-string">=</span> <span class="hljs-string">"tmp/freelance-invoice-hub"</span>
<span class="hljs-comment"># Customize binary, can setup environment variables when run your app.</span>
<span class="hljs-comment">#full_bin = "./tmp/main"</span>
<span class="hljs-string">full_bin</span> <span class="hljs-string">=</span> <span class="hljs-string">"export $(grep -v '^#' dev.env | xargs);tmp/freelance-invoice-hub"</span>
<span class="hljs-comment"># Watch these filename extensions.</span>
<span class="hljs-string">include_ext</span> <span class="hljs-string">=</span> [<span class="hljs-string">"go"</span>, <span class="hljs-string">"tpl"</span>, <span class="hljs-string">"tmpl"</span>, <span class="hljs-string">"html"</span>, <span class="hljs-string">"js"</span>, <span class="hljs-string">"css"</span>, <span class="hljs-string">"scss"</span>, <span class="hljs-string">"toml"</span>]
<span class="hljs-comment"># Ignore these filename extensions or directories.</span>
<span class="hljs-string">exclude_dir</span> <span class="hljs-string">=</span> [<span class="hljs-string">"bin"</span>, <span class="hljs-string">"tmp"</span>, <span class="hljs-string">"vendor"</span>, <span class="hljs-string">"certs"</span>, <span class="hljs-string">"static"</span>, <span class="hljs-string">"uploads"</span>]
<span class="hljs-comment"># Watch these directories if you specified.</span>
<span class="hljs-string">include_dir</span> <span class="hljs-string">=</span> []
<span class="hljs-comment"># Watch these files.</span>
<span class="hljs-string">include_file</span> <span class="hljs-string">=</span> []
<span class="hljs-comment"># Exclude files.</span>
<span class="hljs-string">exclude_file</span> <span class="hljs-string">=</span> []
<span class="hljs-comment"># Exclude specific regular expressions.</span>
<span class="hljs-string">exclude_regex</span> <span class="hljs-string">=</span> [<span class="hljs-string">"_test\\.go"</span>, <span class="hljs-string">"mock_.*\\.go"</span>, <span class="hljs-string">"gomock_.*\\.go"</span>]
<span class="hljs-comment"># Exclude unchanged files.</span>
<span class="hljs-string">exclude_unchanged</span> <span class="hljs-string">=</span> <span class="hljs-literal">true</span>
<span class="hljs-comment"># Follow symlink for directories</span>
<span class="hljs-string">follow_symlink</span> <span class="hljs-string">=</span> <span class="hljs-literal">true</span>
<span class="hljs-comment"># This log file places in your tmp_dir.</span>
<span class="hljs-string">log</span> <span class="hljs-string">=</span> <span class="hljs-string">"air.log"</span>
<span class="hljs-comment"># Poll files for changes instead of using fsnotify.</span>
<span class="hljs-string">poll</span> <span class="hljs-string">=</span> <span class="hljs-literal">false</span>
<span class="hljs-comment"># Poll interval (defaults to the minimum interval of 500ms).</span>
<span class="hljs-string">poll_interval</span> <span class="hljs-string">=</span> <span class="hljs-number">500</span> <span class="hljs-comment"># ms</span>
<span class="hljs-comment"># It's not necessary to trigger build each time file changes if it's too frequent.</span>
<span class="hljs-string">delay</span> <span class="hljs-string">=</span> <span class="hljs-number">1000</span> <span class="hljs-comment"># ms</span>
<span class="hljs-comment"># Stop running old binary when build errors occur.</span>
<span class="hljs-string">stop_on_error</span> <span class="hljs-string">=</span> <span class="hljs-literal">true</span>
<span class="hljs-comment"># Send Interrupt signal before killing process (windows does not support this feature)</span>
<span class="hljs-string">send_interrupt</span> <span class="hljs-string">=</span> <span class="hljs-literal">true</span>
<span class="hljs-comment"># Delay after sending Interrupt signal</span>
<span class="hljs-string">kill_delay</span> <span class="hljs-string">=</span> <span class="hljs-number">500</span> <span class="hljs-comment"># nanosecond</span>
<span class="hljs-comment"># Rerun binary or not</span>
<span class="hljs-string">rerun</span> <span class="hljs-string">=</span> <span class="hljs-literal">false</span>
<span class="hljs-comment"># Delay after each executions</span>
<span class="hljs-string">rerun_delay</span> <span class="hljs-string">=</span> <span class="hljs-number">500</span>
<span class="hljs-comment"># Add additional arguments when running binary (bin/full_bin). Will run './tmp/main hello world'.</span>
<span class="hljs-comment">#args_bin = ["hello", "world"]</span>

[<span class="hljs-string">log</span>]
<span class="hljs-comment"># Show log time</span>
<span class="hljs-string">time</span> <span class="hljs-string">=</span> <span class="hljs-literal">false</span>
<span class="hljs-comment"># Only show main log (silences watcher, build, runner)</span>
<span class="hljs-string">main_only</span> <span class="hljs-string">=</span> <span class="hljs-literal">false</span>

[<span class="hljs-string">color</span>]
<span class="hljs-comment"># Customize each part's color. If no color found, use the raw app log.</span>
<span class="hljs-string">main</span> <span class="hljs-string">=</span> <span class="hljs-string">"magenta"</span>
<span class="hljs-string">watcher</span> <span class="hljs-string">=</span> <span class="hljs-string">"cyan"</span>
<span class="hljs-string">build</span> <span class="hljs-string">=</span> <span class="hljs-string">"yellow"</span>
<span class="hljs-string">runner</span> <span class="hljs-string">=</span> <span class="hljs-string">"green"</span>

[<span class="hljs-string">misc</span>]
<span class="hljs-comment"># Delete tmp directory on exit</span>
<span class="hljs-string">clean_on_exit</span> <span class="hljs-string">=</span> <span class="hljs-literal">true</span>

[<span class="hljs-string">screen</span>]
<span class="hljs-string">clear_on_rebuild</span> <span class="hljs-string">=</span> <span class="hljs-literal">true</span>
<span class="hljs-string">keep_scroll</span> <span class="hljs-string">=</span> <span class="hljs-literal">true</span>
</code></pre>
<p>Basically, you take that <a target="_blank" href="https://github.com/air-verse/air/blob/master/air_example.toml">file</a> and modify it. We are not going to explain more here.<br />If you want to see what changed compared to the original, focus on the <code>[build]</code> section. I mostly just changed the paths to match our project.  </p>
<p>Finally add the tmp folder to your .gitignore file:  </p>
<pre><code class="lang-bash"><span class="hljs-built_in">echo</span> <span class="hljs-string">"tmp"</span> &gt;&gt; .gitignore
</code></pre>
<p><strong>Docker-compose</strong></p>
<p>To run our project we are going to use Docker Compose.</p>
<p>Create the file with <code>touch dev.docker-compose.yaml</code> and paste the following:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">services:</span>
  <span class="hljs-attr">app:</span>
    <span class="hljs-attr">build:</span>
      <span class="hljs-attr">context:</span> <span class="hljs-string">.</span>
      <span class="hljs-attr">dockerfile:</span> <span class="hljs-string">dev.Dockerfile</span>
    <span class="hljs-attr">env_file:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">'dev.env'</span>
    <span class="hljs-attr">ports:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">'127.0.0.1:443:443'</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">'127.0.0.1:80:8080'</span>
    <span class="hljs-attr">extra_hosts:</span>
        <span class="hljs-bullet">-</span> <span class="hljs-string">"local.freelance-invoice-hub.com:127.0.0.1"</span>
    <span class="hljs-attr">volumes:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">.:/app</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">freelance-invoice-hub_mod_cache:/go/pkg/mod</span>

<span class="hljs-attr">volumes:</span>
  <span class="hljs-attr">freelance-invoice-hub_mod_cache:</span>
</code></pre>
<p>Finally create the <code>dev.env</code> file with <code>touch dev.env</code>. We are going to add environment variables for development there later.</p>
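<p>Both the <code>env_file</code> entry in the compose file and the <code>full_bin</code> line in <code>.air.toml</code> end up exposing the entries of <code>dev.env</code> as ordinary environment variables. A small sketch of how the application could read one of them with a fallback follows; <code>getenv</code> and the variable name <code>INVOICEHUB_ADDR</code> are hypothetical, the series has not defined any variables yet.</p>

```go
package main

import (
	"fmt"
	"os"
)

// getenv returns the value of key, or fallback when the variable is unset or empty.
func getenv(key, fallback string) string {
	if v, ok := os.LookupEnv(key); ok && v != "" {
		return v
	}
	return fallback
}

func main() {
	// INVOICEHUB_ADDR is a made-up name used only for this illustration.
	fmt.Println("listening on", getenv("INVOICEHUB_ADDR", ":8080"))
}
```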
<p>Let's test the setup</p>
<pre><code class="lang-bash">docker compose -f dev.docker-compose.yaml up
</code></pre>
<p>Docker should now build the image and start the container.</p>
<p>You should see something like this:</p>
<pre><code class="lang-bash">[+] Running 1/0
 ✔ Container freelance-invoice-hub-app-1  Created                                                                                                                                                                                        0.0s
Attaching to app-1
app-1  |
app-1  |   __    _   ___
app-1  |  / /\  | | | |_)
app-1  | /_/--\ |_| |_| \_ v1.52.2, built with Go go1.22.1
app-1  |
app-1  | mkdir /app/tmp
app-1  | watching .
app-1  | watching cmd
app-1  | watching http
app-1  | watching mocks
app-1  | watching sqlite
app-1  | !exclude tmp
app-1  | &gt; go mod download
app-1  | go: no module dependencies to download
app-1  | building...
app-1  | running...
app-1  | <span class="hljs-built_in">export</span> GOLANG_VERSION=<span class="hljs-string">'1.22.1'</span>
app-1  | <span class="hljs-built_in">export</span> GOPATH=<span class="hljs-string">'/go'</span>
app-1  | <span class="hljs-built_in">export</span> GOTOOLCHAIN=<span class="hljs-string">'local'</span>
app-1  | <span class="hljs-built_in">export</span> HOME=<span class="hljs-string">'/root'</span>
app-1  | <span class="hljs-built_in">export</span> HOSTNAME=<span class="hljs-string">'5a8638cf6ceb'</span>
app-1  | <span class="hljs-built_in">export</span> PATH=<span class="hljs-string">'/go/bin:/usr/local/go/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin'</span>
app-1  | <span class="hljs-built_in">export</span> PWD=<span class="hljs-string">'/app'</span>
app-1  | hello world
app-1  | Process Exit with Code 0
</code></pre>
<p>Notice the <code>hello world</code> at the end. This is what our program does.<br />Now let's test whether it reloads when we make a change.  </p>
<p>In <code>cmd/main.go</code> change the <code>hello world</code> string to <code>hello hot reload</code>.</p>
<p>The console that started the container now automatically rebuilds and runs the modified code:</p>
<pre><code class="lang-bash">         cmd/main.go has changed
app-1  | &gt; go mod download
app-1  | go: no module dependencies to download
app-1  | building...
app-1  | running...
app-1  | <span class="hljs-built_in">export</span> GOLANG_VERSION=<span class="hljs-string">'1.22.1'</span>
app-1  | <span class="hljs-built_in">export</span> GOPATH=<span class="hljs-string">'/go'</span>
app-1  | <span class="hljs-built_in">export</span> GOTOOLCHAIN=<span class="hljs-string">'local'</span>
app-1  | <span class="hljs-built_in">export</span> HOME=<span class="hljs-string">'/root'</span>
app-1  | <span class="hljs-built_in">export</span> HOSTNAME=<span class="hljs-string">'5a8638cf6ceb'</span>
app-1  | <span class="hljs-built_in">export</span> PATH=<span class="hljs-string">'/go/bin:/usr/local/go/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin'</span>
app-1  | <span class="hljs-built_in">export</span> PWD=<span class="hljs-string">'/app'</span>
app-1  | hello hot reload
</code></pre>
<p><strong>Makefile</strong></p>
<p>Since it's tedious to type <code>docker compose -f dev.docker-compose.yaml up</code> all the time,<br />let's add a command for that to our Makefile. Keep in mind that the recipe lines in a Makefile must be indented with a tab character, not spaces:</p>
<pre><code class="lang-bash">default: <span class="hljs-built_in">help</span>

<span class="hljs-built_in">help</span>: <span class="hljs-comment">## help information about make commands</span>
    @grep -E <span class="hljs-string">'^[a-zA-Z_-]+:.*?## .*$$'</span> $(MAKEFILE_LIST) | sort | awk <span class="hljs-string">'BEGIN {FS = ":.*?## "}; {printf "\033[36m%-30s\033[0m %s\n", $$1, $$2}'</span>

dev: <span class="hljs-comment">## runs the application in development mode</span>
    @docker compose -f dev.docker-compose.yaml up
</code></pre>
<p>Now run</p>
<p><code>make</code></p>
<p>and you should see:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718261142919/0b0a5ff2-4d41-4de8-b1d7-97d10786bb1b.png" alt class="image--center mx-auto" /></p>
<p>And you can run your docker development container by using:  </p>
<pre><code class="lang-bash">make dev
</code></pre>
<p>-&gt;</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718261311806/1fbd2194-6d65-4f18-8198-b29d5b1c63bc.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-commit">Commit</h3>
<p>It's time for us to commit and merge to the main branch  </p>
<pre><code class="lang-bash">git add .
git commit -m <span class="hljs-string">"Initial project setup"</span>
git checkout main
git merge project-skeleton
</code></pre>
<h3 id="heading-ia">Summary</h3>
<p>In this blog post, we have successfully set up the basic skeleton for our Golang-based web application. We created a structured project layout, set up essential files, and configured Docker and Docker Compose for local development with hot reloading. With this foundation in place, we are now ready to start building and extending our application.  </p>
<p>The code is accessible in the <code>project-skeleton</code> branch on <a target="_blank" href="https://github.com/gosom/freelance-invoice-hub/tree/project-skeleton">GitHub</a>.</p>
<p>Stay tuned for the next part of the series, where we will dive deeper into implementing core functionalities. Happy coding!</p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text">Please comment or ask on <a target="_blank" href="https://x.com/gkomdev">Twitter</a> if you have issues or need extra help. Happy to help</div>
</div>

<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text">If you found this guide helpful, please consider subscribing to our newsletter for more tutorials and updates. And don't forget to share this article with others who might benefit from it!</div>
</div>]]></content:encoded></item><item><title><![CDATA[Crafting a Web Application with Golang: A Step-by-Step Guide]]></title><description><![CDATA[Introduction
This is the first part of the series Build a Web App with Golang. I am going to show you how you can build a web application using Golang. In fact, we are going to build an application that I need.
I this blog post we will cover the follow...]]></description><link>https://blog.gkomninos.com/crafting-a-web-application-with-golang-a-step-by-step-guide</link><guid isPermaLink="true">https://blog.gkomninos.com/crafting-a-web-application-with-golang-a-step-by-step-guide</guid><category><![CDATA[coding]]></category><category><![CDATA[golang]]></category><category><![CDATA[Go Language]]></category><category><![CDATA[Tutorial]]></category><category><![CDATA[Web Development]]></category><dc:creator><![CDATA[Georgios Komninos]]></dc:creator><pubDate>Thu, 13 Jun 2024 05:00:14 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1718256022312/d5e71b12-b803-4759-821a-573877b1d061.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>This is the first part of the series <a target="_blank" href="https://blog.gkomninos.com/series/webapp-using-golang">Build a Web App with Golang</a>. I am going to show you how you can build a web application using Golang. In fact, we are going to build an application that I need.</p>
<p>In this blog post we will cover the following:</p>
<ul>
<li>Scope, meaning we will define what we want to build</li>
</ul>
<p>The goal of this blog post is to understand the problem and its domain. This will help us understand what we need to code.</p>
<p>I will try to keep each blog post in a separate git branch. I believe this will make it easier for you to follow the tutorial.</p>
<h3 id="heading-scope">Scope</h3>
<blockquote>
<p>As a freelancer I want a web application in which I will login and I can create invoices for my clients.</p>
<p>I want to be able to view the invoices created and download a PDF version</p>
</blockquote>
<p>Let's break down the above requirements one by one.</p>
<p>First, let's see what an invoice contains:</p>
<ul>
<li><p>Freelancer's details which include:<br />  Company Name<br />  Company Address<br />  Email<br />  Tax Number<br />  VAT Number<br />  Bank Accounts</p>
</li>
<li><p>Client details which include:<br />  Company Name<br />  Company Address<br />  Email<br />  VAT number</p>
</li>
<li><p>Invoice details which include:<br />  Invoice Number<br />  Invoice Date<br />  Line Items</p>
</li>
</ul>
<p>Here is a sample invoice:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1718209842815/916943cc-1bd4-4599-ad62-9e3fdb566638.png" alt class="image--center mx-auto" /></p>
<p>A Line Item consists of:<br />- Description<br />- Disbursements<br />- Fees</p>
<p>A Bank Account consists of:<br />- Bank Name<br />- Account No.<br />- IBAN<br />- BIC</p>
<p>We also need the VAT rate for our country and the number of days within which the invoice is payable.</p>
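<p>To make the domain more concrete, the entities above could be sketched as Go types. This is only a first draft under my own assumptions: the field names, storing amounts in cents, and keeping the VAT rate and payment term on the invoice itself are all choices the series has not fixed yet.</p>

```go
package main

import (
	"fmt"
	"time"
)

// BankAccount holds the payment details printed on an invoice.
type BankAccount struct {
	BankName  string
	AccountNo string
	IBAN      string
	BIC       string
}

// Company describes either the freelancer or a client.
// For clients, TaxNumber and BankAccounts stay empty.
type Company struct {
	Name         string
	Address      string
	Email        string
	TaxNumber    string
	VATNumber    string
	BankAccounts []BankAccount
}

// LineItem is one row of an invoice. Amounts are in cents (an assumption)
// to avoid floating point rounding.
type LineItem struct {
	Description   string
	Disbursements int64
	Fees          int64
}

// Invoice ties the pieces together.
type Invoice struct {
	Number         string
	Date           time.Time
	Freelancer     Company
	Client         Company
	Items          []LineItem
	VATRatePercent float64 // VAT rate for our country, e.g. 19 for 19%
	PayableDays    int     // number of days within which the invoice is payable
}

// Subtotal sums fees and disbursements over all line items, in cents, before VAT.
func (inv Invoice) Subtotal() int64 {
	var total int64
	for _, item := range inv.Items {
		total += item.Disbursements + item.Fees
	}
	return total
}

// DueDate is the invoice date plus the payment term.
func (inv Invoice) DueDate() time.Time {
	return inv.Date.AddDate(0, 0, inv.PayableDays)
}

func main() {
	inv := Invoice{
		Number:      "1030/24",
		Date:        time.Date(2024, time.June, 13, 0, 0, 0, 0, time.UTC),
		PayableDays: 30,
		Items: []LineItem{
			{Description: "Consulting", Fees: 100000},
			{Description: "Travel", Disbursements: 2500},
		},
	}
	fmt.Println(inv.Subtotal(), inv.DueDate().Format("2006-01-02"))
}
```

<p>I left the VAT calculation out on purpose, because which amounts are taxable depends on the country.</p>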
<p>Let's now discuss the invoice number. It should be unique per invoice and in most cases it must be sequential. In my case the invoice number looks like 1030/24, which means that the invoice belongs to the year 2024 and has the number 1030. The next invoice should be 1031/24 and so on.<br />This generation should be automated.</p>
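<p>A minimal sketch of that generation could look like the function below. <code>NextInvoiceNumber</code> is a hypothetical name, and the sketch assumes that the sequence keeps counting when the year changes and only the two-digit suffix follows the current year; if your numbering restarts every year, that part needs to change.</p>

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// NextInvoiceNumber returns the number that follows last, using the given year
// for the two-digit suffix. last must have the form "<sequence>/<yy>", e.g. "1030/24".
func NextInvoiceNumber(last string, year int) (string, error) {
	seqPart, _, ok := strings.Cut(last, "/")
	if !ok {
		return "", fmt.Errorf("invoice number %q is not in the form N/YY", last)
	}
	seq, err := strconv.Atoi(seqPart)
	if err != nil {
		return "", fmt.Errorf("invoice number %q has a non-numeric sequence: %w", last, err)
	}
	// Assumption: the sequence never resets, only the year suffix changes.
	return fmt.Sprintf("%d/%02d", seq+1, year%100), nil
}

func main() {
	next, err := NextInvoiceNumber("1030/24", 2024)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(next)
}
```

<p>In the real application the last number would come from the database, and reading it and storing the new invoice should happen in one transaction so that two invoices never get the same number.</p>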
<p>Since this is an MVP that only needs to work for one freelancer, we will use HTTP Basic Authentication and keep the UI really minimal.</p>
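<p>As a rough idea of what HTTP Basic Authentication involves, here is a minimal sketch using only Go's standard library. The names <code>basicAuth</code>, <code>validCredentials</code> and <code>statusFor</code>, as well as the hard-coded credentials, are made up for this example; the tutorial itself may end up using the framework's own middleware instead.</p>

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// validCredentials compares the credentials in constant time so that the
// comparison itself does not leak how many characters matched.
func validCredentials(gotUser, gotPass, wantUser, wantPass string) bool {
	userOK := subtle.ConstantTimeCompare([]byte(gotUser), []byte(wantUser)) == 1
	passOK := subtle.ConstantTimeCompare([]byte(gotPass), []byte(wantPass)) == 1
	return userOK && passOK
}

// basicAuth rejects every request that does not carry the expected
// username and password in its Authorization header.
func basicAuth(wantUser, wantPass string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		user, pass, ok := r.BasicAuth()
		if !ok || !validCredentials(user, pass, wantUser, wantPass) {
			w.Header().Set("WWW-Authenticate", `Basic realm="invoices"`)
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// statusFor sends one in-memory request through the protected handler
// and returns the resulting HTTP status code.
func statusFor(user, pass string) int {
	protected := basicAuth("freelancer", "secret", http.HandlerFunc(
		func(w http.ResponseWriter, r *http.Request) {
			fmt.Fprintln(w, "invoice list")
		}))

	req := httptest.NewRequest(http.MethodGet, "/", nil)
	req.SetBasicAuth(user, pass)
	rec := httptest.NewRecorder()
	protected.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor("freelancer", "secret")) // 200
	fmt.Println(statusFor("freelancer", "wrong"))  // 401
}
```

<p>Basic Authentication sends the credentials with every request, so it only makes sense when the application is served over HTTPS.</p>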
<h3 id="heading-pages">Pages</h3>
<p>Since we will build a web application, let's define the pages that we want to build.<br />This is a draft, but it will help us get started.</p>
<p>- /: the homepage of the application. It displays a paginated list of all the invoices created, with a button next to each invoice to view its details.<br />- /settings: displays our company's details<br />- /settings/edit: a form to edit our company's details<br />- /clients: displays a list of our clients<br />- /clients/id: displays the details of the client with the given id<br />- /clients/id/edit: a form to edit the details of the client with the given id<br />- /clients/new: a form to add a new client<br />- /invoices/id/download: downloads the invoice with the given id<br />- /invoices/id: displays the details of the invoice with the given id<br />- /invoices/new: a form to create a new invoice</p>
<h3 id="heading-technologies-that-we-will-use">Technologies that we will use</h3>
<p>We need to store our data somewhere. For simplicity and ease of deployment we are going to use the awesome <a target="_blank" href="https://sqlite.org/">SQLite</a> DBMS.</p>
<p>This post is all about Golang, so that's the back-end language. We are also going to use the <a target="_blank" href="https://echo.labstack.com/">Echo Framework</a> and <a target="_blank" href="https://gorm.io/index.html">GORM</a> to speed up development.</p>
<p>For the front-end I am not sure yet; most likely we are going to use <a target="_blank" href="https://getbootstrap.com/">Bootstrap</a>. I am also pretty sure that we are going to use ChatGPT to write the basic HTML and CSS for us.</p>
<h3 id="heading-conclusion">Conclusion</h3>
<p>In the next blog post we will dive into coding. We will create our git repository and set up some basic tooling that will save us a lot of time while we are developing.</p>
<p>I expect to release a new blog post of the tutorial every day.</p>
<p>If you have any questions or requests, please write a comment.</p>
<p>Please subscribe to my newsletter so you get the updates.</p>
<p>Finally, just today I created an <a target="_blank" href="https://x.com/gkomdev">X Account</a> and I would love to start having some followers.</p>
]]></content:encoded></item><item><title><![CDATA[Use context in your HTTP handlers]]></title><description><![CDATA[Let's consider a simple webserver that runs a task. The task can be anything like a time consuming computation, a database query.
You need to be able to do two things:

when the client drops the connection while the task is running terminate the task...]]></description><link>https://blog.gkomninos.com/use-context-in-your-http-handlers</link><guid isPermaLink="true">https://blog.gkomninos.com/use-context-in-your-http-handlers</guid><category><![CDATA[Go Language]]></category><category><![CDATA[coding]]></category><category><![CDATA[best practices]]></category><category><![CDATA[technology]]></category><dc:creator><![CDATA[Georgios Komninos]]></dc:creator><pubDate>Wed, 12 Jun 2024 06:09:32 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1718173017244/724f161e-cc89-496b-82d4-07bb986ff551.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Let's consider a simple webserver that runs a task. The task can be anything like a time consuming computation, a database query.</p>
<p>You need to be able to do two things:</p>
<ul>
<li><p>When the client drops the connection while the task is running, terminate the task so that resources are released.</p>
</li>
<li><p>When the task takes too long, return a proper HTTP status code to the client.</p>
</li>
</ul>
<p>Let's see a practical example:</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> main

<span class="hljs-keyword">import</span> (
    <span class="hljs-string">"context"</span>
    <span class="hljs-string">"errors"</span>
    <span class="hljs-string">"fmt"</span>
    <span class="hljs-string">"math/rand"</span>
    <span class="hljs-string">"net/http"</span>
    <span class="hljs-string">"time"</span>
)

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span> {
    mux := http.NewServeMux()

    <span class="hljs-comment">// Register the handler; the timeout middleware is applied to the whole mux below</span>
    mux.HandleFunc(<span class="hljs-string">"/"</span>, longRunningTaskHandler)

    <span class="hljs-keyword">const</span> timeout = time.Second * <span class="hljs-number">5</span>
    wrappedMux := contextMiddleware(timeout)(mux)

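    <span class="hljs-comment">// Start the server; ListenAndServe only returns when it fails</span>
    <span class="hljs-keyword">if</span> err := http.ListenAndServe(<span class="hljs-string">":8080"</span>, wrappedMux); err != <span class="hljs-literal">nil</span> {
        <span class="hljs-built_in">panic</span>(err)
    }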
}

<span class="hljs-comment">// contextMiddleware is a middleware that sets a timeout for the request context.</span>
<span class="hljs-comment">// If the request takes longer than the timeout, the middleware will cancel the context.</span>
<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">contextMiddleware</span><span class="hljs-params">(timeout time.Duration)</span> <span class="hljs-title">func</span><span class="hljs-params">(http.Handler)</span> <span class="hljs-title">http</span>.<span class="hljs-title">Handler</span></span> {
    <span class="hljs-keyword">return</span> <span class="hljs-function"><span class="hljs-keyword">func</span><span class="hljs-params">(next http.Handler)</span> <span class="hljs-title">http</span>.<span class="hljs-title">Handler</span></span> {
        <span class="hljs-keyword">return</span> http.HandlerFunc(<span class="hljs-function"><span class="hljs-keyword">func</span><span class="hljs-params">(w http.ResponseWriter, r *http.Request)</span></span> {
            <span class="hljs-comment">// Set a timeout for the request context</span>
            ctx, cancel := context.WithTimeout(r.Context(), timeout)
            <span class="hljs-keyword">defer</span> cancel()

            <span class="hljs-comment">// Create a new request with the updated context</span>
            r = r.WithContext(ctx)

            <span class="hljs-comment">// Call the next handler with the new context</span>
            next.ServeHTTP(w, r)
        })
    }
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">longRunningTaskHandler</span><span class="hljs-params">(w http.ResponseWriter, r *http.Request)</span></span> {
    err := longRunningTask(r.Context())
    <span class="hljs-keyword">if</span> err != <span class="hljs-literal">nil</span> {
        <span class="hljs-comment">// if the context expired, return a 504 Gateway Timeout</span>
        <span class="hljs-keyword">if</span> errors.Is(err, context.DeadlineExceeded) {
            w.WriteHeader(http.StatusGatewayTimeout)
            <span class="hljs-keyword">return</span>
        }

        <span class="hljs-comment">// if the task failed for some other reason, return a 500 Internal Server Error</span>
        w.WriteHeader(http.StatusInternalServerError)

        <span class="hljs-keyword">return</span>
    }

    <span class="hljs-comment">// if the task completed successfully, return a 200 OK</span>
    w.WriteHeader(http.StatusOK)
}

<span class="hljs-comment">// longRunningTask is a dummy function that simulates a long running task.</span>
<span class="hljs-comment">// The task takes either 1 or 10 seconds to complete, chosen at random.</span>
<span class="hljs-comment">// If the context is cancelled before the task completes, it returns the context's error.</span>
<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">longRunningTask</span><span class="hljs-params">(ctx context.Context)</span> <span class="hljs-title">error</span></span> {
    <span class="hljs-keyword">var</span> dur time.Duration

    <span class="hljs-comment">// for ~50% of the time, the task will take 10 seconds to complete</span>
    <span class="hljs-keyword">if</span> rand.Float64() &gt; <span class="hljs-number">0.5</span> {
        dur = <span class="hljs-number">10</span> * time.Second
    } <span class="hljs-keyword">else</span> {
        dur = <span class="hljs-number">1</span> * time.Second <span class="hljs-comment">// and for the other ~50% of the time, the task will take 1 second to complete</span>
    }

    <span class="hljs-comment">// simulate the task by sleeping for the duration</span>
    <span class="hljs-keyword">select</span> {
    <span class="hljs-keyword">case</span> &lt;-time.After(dur):
        <span class="hljs-keyword">return</span> <span class="hljs-literal">nil</span>
    <span class="hljs-keyword">case</span> &lt;-ctx.Done():
        fmt.Println(<span class="hljs-string">"task cancelled"</span>)
        <span class="hljs-keyword">return</span> ctx.Err()
    }
}
</code></pre>
<p>Run the webserver:</p>
<pre><code class="lang-bash">go run main.go
</code></pre>
<p>and in another terminal:</p>
<pre><code class="lang-bash">curl -i http://localhost:8080
</code></pre>
<p>The curl command returns after about 5 seconds at most, even though the task can take up to 10 seconds.</p>
<p>See some runs:</p>
<pre><code class="lang-bash">➜  ~ curl -i http://localhost:8080/
HTTP/1.1 200 OK
Date: Wed, 12 Jun 2024 05:52:18 GMT
Content-Length: 0

➜  ~ curl -i http://localhost:8080/
HTTP/1.1 504 Gateway Timeout
Date: Wed, 12 Jun 2024 05:52:24 GMT
Content-Length: 0

➜  ~ curl -i http://localhost:8080/
HTTP/1.1 200 OK
Date: Wed, 12 Jun 2024 05:52:26 GMT
Content-Length: 0
</code></pre>
<p>As you can see, the second request hit the 10-second branch, so the 5-second timeout expired and the server returned 504.</p>
<p>Please run this on your computer and press Ctrl-C in the curl terminal while a request is running.<br />Watch the output of the webserver and you will see a message like:</p>
<p><code>task cancelled</code></p>
<p>See the related <a target="_blank" href="https://github.com/gosom/gkomdev/blob/main/examples/http-handler-context/main.go">GitHub</a> repo.</p>
]]></content:encoded></item><item><title><![CDATA[Unveiling a Powerful Google Maps Scraping Tool]]></title><description><![CDATA[Introduction
In a world where data is increasingly crucial, having the right tools to gather information is essential. This introduces a command-line Google Maps scraper, developed using the Scrapemate web crawling framework. Ideal for business analy...]]></description><link>https://blog.gkomninos.com/unveiling-a-powerful-google-maps-scraping-tool</link><guid isPermaLink="true">https://blog.gkomninos.com/unveiling-a-powerful-google-maps-scraping-tool</guid><category><![CDATA[Scraping]]></category><category><![CDATA[data analysis]]></category><category><![CDATA[google maps]]></category><category><![CDATA[data scraper]]></category><dc:creator><![CDATA[Georgios Komninos]]></dc:creator><pubDate>Sun, 03 Dec 2023 11:39:14 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1701603377999/9159cffe-af14-44c6-85a0-f33042fc4382.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>In a world where data is increasingly crucial, having the right tools to gather information is essential. This post introduces a command-line <a target="_blank" href="https://s.gkomninos.com/s/Ww6eBR">Google Maps scraper</a>, developed using the Scrapemate web crawling framework. Ideal for business analysts, data scientists, or anyone in need of detailed location data, this tool offers an efficient way to extract valuable information from Google Maps.</p>
<p>Try it now: <a target="_blank" href="https://s.gkomninos.com/s/Ww6eBR">https://s.gkomninos.com/s/Ww6eBR</a></p>
<h2 id="heading-what-is-the-google-maps-scraper">What is the Google Maps Scraper?</h2>
<p>The Google Maps scraper is an open source command-line tool designed to extract a wide array of data points from Google Maps. Users have the flexibility to utilize the tool as is or customize its code to fit specific needs.</p>
<h2 id="heading-key-features">Key Features</h2>
<ul>
<li><p><strong>Data Extraction</strong>: Extracts numerous data points from Google Maps.</p>
</li>
<li><p><strong>Export Options</strong>: Allows exporting data in CSV, JSON, or directly to PostgreSQL.</p>
</li>
<li><p><strong>Performance</strong>: Extracts data from approximately 55 places per minute.</p>
</li>
<li><p><strong>Customization</strong>: Extendable for writing custom exporters or code modifications.</p>
</li>
<li><p><strong>Cross-Platform</strong>: Dockerized for compatibility across various platforms.</p>
</li>
<li><p><strong>Scalability</strong>: Suitable for scaling across multiple machines.</p>
</li>
<li><p><strong>Email Extraction</strong>: Optionally extracts emails from business websites.</p>
</li>
</ul>
<h2 id="heading-email-extraction">Email Extraction</h2>
<p>The email extraction feature is disabled by default to optimize performance. When enabled, the scraper visits the website of a listed business and tries to extract emails from the page, primarily from the main page registered on Google Maps. Future updates aim to include extraction from additional pages such as 'About Us' and 'Contact'.</p>
<h2 id="heading-extracted-data-points">Extracted Data Points</h2>
<p>The tool is capable of extracting a comprehensive list of data points including:</p>
<ul>
<li><p>Link, title, category, address</p>
</li>
<li><p>Open hours, popular times, website, phone</p>
</li>
<li><p>Plus code, review count, review rating</p>
</li>
<li><p>Latitude, longitude, and much more</p>
</li>
</ul>
<h2 id="heading-disclaimer">Disclaimer</h2>
<p>Please use this program responsibly. It is important to adhere to Google Maps' terms of service and local laws regarding data scraping and privacy. The tool is provided for legitimate purposes and should be used with ethical considerations in mind.</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>This <a target="_blank" href="https://s.gkomninos.com/s/Ww6eBR">Google Maps scraper tool</a> offers an easy and efficient way to access detailed location data. Whether for professional data analysis or personal use, this tool is a valuable resource for anyone needing comprehensive data from Google Maps.</p>
]]></content:encoded></item><item><title><![CDATA[PostgreSQL window functions crash course]]></title><description><![CDATA[Introduction
According to PostgreSQL documentation

Window functions provide the ability to perform calculations across sets of rows that are related to the current query row.

Window functions in PostgreSQL are a powerful tool that can significantly...]]></description><link>https://blog.gkomninos.com/postgresql-window-functions-crash-course</link><guid isPermaLink="true">https://blog.gkomninos.com/postgresql-window-functions-crash-course</guid><category><![CDATA[SQL]]></category><category><![CDATA[Databases]]></category><category><![CDATA[PostgreSQL]]></category><category><![CDATA[Dataanalysis]]></category><category><![CDATA[learningSQL]]></category><dc:creator><![CDATA[Georgios Komninos]]></dc:creator><pubDate>Sat, 20 May 2023 10:24:50 GMT</pubDate><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>According to <a target="_blank" href="https://www.postgresql.org/docs/15/functions-window.html">PostgreSQL documentation</a></p>
<blockquote>
<p><em>Window functions</em> provide the ability to perform calculations across sets of rows that are related to the current query row.</p>
</blockquote>
<p>Window functions in <a target="_blank" href="https://www.postgresql.org/docs/15/functions-window.html">PostgreSQL</a> are a powerful tool that can significantly enhance your data analysis capabilities. They can perform calculations across a set of related rows, in a way traditional SQL queries can't. In this blog post, I'll walk you through the concept of window functions, their syntax, and several examples to better understand their practical use cases.</p>
<p>The primary goal of this post is that, after reading it and trying all the examples, you will understand how to use PostgreSQL window functions.</p>
<p>Please also use the <a target="_blank" href="https://github.com/gosom/postgres-window-functions">accompanying GitHub repository</a>.</p>
<p>To follow the crash course I highly encourage you to try the examples.</p>
<h3 id="heading-clone-the-github-repository">Clone the GitHub repository</h3>
<p>First, clone the repository:</p>
<p><code>git clone https://github.com/gosom/postgres-window-functions.git</code></p>
<p>Then enter the folder</p>
<p><code>cd postgres-window-functions</code></p>
<p>A <code>docker-compose.yaml</code> file is provided that spins up a PostgreSQL Docker container containing the example tables and data.</p>
<p>To start the container: <code>docker-compose up -d</code></p>
<p>You can connect to the database from the command line using:</p>
<p><code>make psql</code></p>
<p>Alternatively, use your favorite PostgreSQL client with the connection details below.</p>
<pre><code class="lang-bash">Hostname: localhost
Port: 5432
Username: postgres
Password: postgres
Database: postgres
</code></pre>
<h2 id="heading-the-dataset">The Dataset</h2>
<p>We'll use a hypothetical dataset from a library system, specifically a <code>read_log</code> table, which tracks the reading habits of various individuals:</p>
<pre><code class="lang-sql"><span class="hljs-keyword">CREATE</span> <span class="hljs-keyword">TABLE</span> read_log (
    <span class="hljs-keyword">id</span> <span class="hljs-built_in">INT</span> <span class="hljs-keyword">GENERATED</span> <span class="hljs-keyword">ALWAYS</span> <span class="hljs-keyword">AS</span> <span class="hljs-keyword">IDENTITY</span> PRIMARY <span class="hljs-keyword">KEY</span>,
    reader_name <span class="hljs-built_in">TEXT</span>,
    book_title <span class="hljs-built_in">TEXT</span>,
    pages <span class="hljs-built_in">INT</span>,
    read_date <span class="hljs-built_in">DATE</span>
);
</code></pre>
<ul>
<li><p><code>id</code>: is an auto-increment integer which is unique per row</p>
</li>
<li><p><code>reader_name</code>: is the name of the reader</p>
</li>
<li><p><code>book_title</code>: is the title of the book</p>
</li>
<li><p><code>pages</code>: is the number of pages of the book</p>
</li>
<li><p><code>read_date</code>: is the date that the reader finished the book</p>
</li>
</ul>
<p>Now, let's populate our <code>read_log</code> table with sample data:</p>
<pre><code class="lang-sql"><span class="hljs-keyword">INSERT</span> <span class="hljs-keyword">INTO</span> read_log (reader_name, book_title, pages, read_date) <span class="hljs-keyword">VALUES</span> 
(<span class="hljs-string">'Giorgos'</span>, <span class="hljs-string">'The Iliad'</span>, <span class="hljs-number">711</span>, <span class="hljs-string">'2023-01-01'</span>),
(<span class="hljs-string">'Giorgos'</span>, <span class="hljs-string">'The Odyssey'</span>, <span class="hljs-number">800</span>, <span class="hljs-string">'2023-02-15'</span>),
(<span class="hljs-string">'Giorgos'</span>, <span class="hljs-string">'The Republic'</span>, <span class="hljs-number">400</span>, <span class="hljs-string">'2023-03-30'</span>),
(<span class="hljs-string">'Emmanouela'</span>, <span class="hljs-string">'Antigone'</span>, <span class="hljs-number">150</span>, <span class="hljs-string">'2023-01-20'</span>),
(<span class="hljs-string">'Emmanouela'</span>, <span class="hljs-string">'Oedipus Rex'</span>, <span class="hljs-number">100</span>, <span class="hljs-string">'2023-02-26'</span>),
(<span class="hljs-string">'Emmanouela'</span>, <span class="hljs-string">'The Symposium'</span>, <span class="hljs-number">200</span>, <span class="hljs-string">'2023-04-10'</span>),
(<span class="hljs-string">'Eleni'</span>, <span class="hljs-string">'The Histories'</span>, <span class="hljs-number">900</span>, <span class="hljs-string">'2023-02-01'</span>),
(<span class="hljs-string">'Eleni'</span>, <span class="hljs-string">'Works and Days'</span>, <span class="hljs-number">150</span>, <span class="hljs-string">'2023-03-15'</span>),
(<span class="hljs-string">'Eleni'</span>, <span class="hljs-string">'Prometheus Bound'</span>, <span class="hljs-number">80</span>, <span class="hljs-string">'2023-04-25'</span>),
(<span class="hljs-string">'Eleni'</span>, <span class="hljs-string">'Metamorphoses'</span>, <span class="hljs-number">480</span>, <span class="hljs-string">'2023-02-15'</span>),
(<span class="hljs-string">'Konstantina'</span>, <span class="hljs-string">'The Iliad'</span>, <span class="hljs-number">711</span>, <span class="hljs-string">'2023-01-22'</span>),
(<span class="hljs-string">'Konstantina'</span>, <span class="hljs-string">'The Odyssey'</span>, <span class="hljs-number">800</span>, <span class="hljs-string">'2023-03-18'</span>),
(<span class="hljs-string">'Konstantina'</span>, <span class="hljs-string">'The Symposium'</span>, <span class="hljs-number">200</span>, <span class="hljs-string">'2023-04-30'</span>),
(<span class="hljs-string">'Andreas'</span>, <span class="hljs-string">'Antigone'</span>, <span class="hljs-number">150</span>, <span class="hljs-string">'2023-01-30'</span>),
(<span class="hljs-string">'Andreas'</span>, <span class="hljs-string">'Oedipus Rex'</span>, <span class="hljs-number">100</span>, <span class="hljs-string">'2023-02-20'</span>),
(<span class="hljs-string">'Andreas'</span>, <span class="hljs-string">'The Republic'</span>, <span class="hljs-number">400</span>, <span class="hljs-string">'2023-04-18'</span>),
(<span class="hljs-string">'Andreas'</span>, <span class="hljs-string">'Metamorphoses'</span>, <span class="hljs-number">480</span>, <span class="hljs-string">'2023-05-08'</span>);
</code></pre>
<h2 id="heading-understanding-window-functions">Understanding Window Functions</h2>
<p>A window function performs a calculation across a set of table rows that are related to the current row. It's like an advanced version of an aggregate function (like <code>SUM()</code>, <code>COUNT()</code>, etc.), but instead of collapsing all the rows into a single output row, it keeps every row in the result.</p>
<h3 id="heading-syntax">Syntax</h3>
<pre><code class="lang-sql">window_function (expression) OVER (
    [PARTITION BY partition_expression]
    [ORDER BY sort_expression]
    [frame_clause]
)
</code></pre>
<p>Let's break this down:</p>
<ul>
<li><p><code>window_function (expression)</code>: this part is similar to a regular SQL function call. The <code>expression</code> is the subject of the calculation. For example, if we are using the <code>SUM()</code> window function, the expression would be the column we want to sum.</p>
</li>
<li><p><code>OVER(...)</code>: A window function will always include an <code>OVER</code> clause. It defines the 'window' or set of rows the function operates on.</p>
</li>
<li><p><code>PARTITION BY partition_expression</code>: This is used to break the data into smaller partitions. The window function is applied within each of these partitions. If it's not specified, the window function treats all rows of the result set as a single partition.</p>
</li>
<li><p><code>ORDER BY sort_expression</code>: This clause determines the order in which rows are processed by the window function. Rows are ordered according to this expression before the function is applied.</p>
</li>
<li><p><code>frame_clause</code>: This clause further refines the window within a partition that the window function operates on.<br />  This is a more complex part that I will cover later. For now, focus on understanding the basic syntax and capabilities of window functions.</p>
</li>
</ul>
<p><strong>Let's see the syntax in action:</strong></p>
<p><em><mark>Let's suppose that we want to get all the rows from the </mark></em> <code>read_log</code> <em><mark>table but we additionally want in each row the </mark></em> <code>tp</code><em><mark>, which is the total number of pages for all readers.</mark></em></p>
<p><strong><em>Without window functions:</em></strong></p>
<p>One way to do it:</p>
<pre><code class="lang-sql"><span class="hljs-keyword">WITH</span> cte <span class="hljs-keyword">AS</span> (
    <span class="hljs-keyword">SELECT</span> 
        <span class="hljs-keyword">sum</span>(pages) <span class="hljs-keyword">as</span> tp 
    <span class="hljs-keyword">FROM</span> 
        read_log
)
<span class="hljs-keyword">SELECT</span> 
    read_log.*, cte.tp 
<span class="hljs-keyword">FROM</span> 
    read_log, cte;
</code></pre>
<p>Result set:</p>
<pre><code class="lang-sql">  id | reader_name |    book_title    | pages | read_date  |  tp  
<span class="hljs-comment">----+-------------+------------------+-------+------------+------</span>
  1 | Giorgos     | The Iliad        |   711 | 2023-01-01 | 6812
  2 | Giorgos     | The Odyssey      |   800 | 2023-02-15 | 6812
  3 | Giorgos     | The Republic     |   400 | 2023-03-30 | 6812
  4 | Emmanouela  | Antigone         |   150 | 2023-01-20 | 6812
  5 | Emmanouela  | Oedipus Rex      |   100 | 2023-02-26 | 6812
  6 | Emmanouela  | The Symposium    |   200 | 2023-04-10 | 6812
  7 | Eleni       | The Histories    |   900 | 2023-02-01 | 6812
  8 | Eleni       | Works and Days   |   150 | 2023-03-15 | 6812
  9 | Eleni       | Prometheus Bound |    80 | 2023-04-25 | 6812
 10 | Eleni       | Metamorphoses    |   480 | 2023-02-15 | 6812
 11 | Konstantina | The Iliad        |   711 | 2023-01-22 | 6812
 12 | Konstantina | The Odyssey      |   800 | 2023-03-18 | 6812
 13 | Konstantina | The Symposium    |   200 | 2023-04-30 | 6812
 14 | Andreas     | Antigone         |   150 | 2023-01-30 | 6812
 15 | Andreas     | Oedipus Rex      |   100 | 2023-02-20 | 6812
 16 | Andreas     | The Republic     |   400 | 2023-04-18 | 6812
 17 | Andreas     | Metamorphoses    |   480 | 2023-05-08 | 6812
</code></pre>
<p><strong><em>Using Window Function</em></strong></p>
<pre><code class="lang-sql"><span class="hljs-keyword">SELECT</span> 
    *, <span class="hljs-keyword">sum</span>(pages) <span class="hljs-keyword">OVER</span>() <span class="hljs-keyword">AS</span> tp
<span class="hljs-keyword">FROM</span> 
    read_log;
</code></pre>
<p><em>Notice that here we are using an empty</em> <code>OVER</code> <em>clause, because we want our window function to operate on all the rows.</em></p>
<p>Result set:</p>
<pre><code class="lang-sql"> id | reader_name |    book_title    | pages | read_date  | tp  
<span class="hljs-comment">----+-------------+------------------+-------+------------+------</span>
  1 | Giorgos     | The Iliad        |   711 | 2023-01-01 | 6812
  2 | Giorgos     | The Odyssey      |   800 | 2023-02-15 | 6812
  3 | Giorgos     | The Republic     |   400 | 2023-03-30 | 6812
  4 | Emmanouela  | Antigone         |   150 | 2023-01-20 | 6812
  5 | Emmanouela  | Oedipus Rex      |   100 | 2023-02-26 | 6812
  6 | Emmanouela  | The Symposium    |   200 | 2023-04-10 | 6812
  7 | Eleni       | The Histories    |   900 | 2023-02-01 | 6812
  8 | Eleni       | Works and Days   |   150 | 2023-03-15 | 6812
  9 | Eleni       | Prometheus Bound |    80 | 2023-04-25 | 6812
 10 | Eleni       | Metamorphoses    |   480 | 2023-02-15 | 6812
 11 | Konstantina | The Iliad        |   711 | 2023-01-22 | 6812
 12 | Konstantina | The Odyssey      |   800 | 2023-03-18 | 6812
 13 | Konstantina | The Symposium    |   200 | 2023-04-30 | 6812
 14 | Andreas     | Antigone         |   150 | 2023-01-30 | 6812
 15 | Andreas     | Oedipus Rex      |   100 | 2023-02-20 | 6812
 16 | Andreas     | The Republic     |   400 | 2023-04-18 | 6812
 17 | Andreas     | Metamorphoses    |   480 | 2023-05-08 | 6812
</code></pre>
<p><em><mark>Our requirement has changed: now we need </mark></em> <code>tp</code> <em><mark>to be the total number of pages read by each reader.</mark></em></p>
<p><strong>Without Window functions</strong></p>
<pre><code class="lang-sql"><span class="hljs-keyword">WITH</span> cte <span class="hljs-keyword">AS</span> (
    <span class="hljs-keyword">SELECT</span> 
        reader_name, <span class="hljs-keyword">sum</span>(pages) <span class="hljs-keyword">AS</span> tp 
    <span class="hljs-keyword">FROM</span> 
        read_log 
    <span class="hljs-keyword">GROUP</span> <span class="hljs-keyword">BY</span> 
        reader_name
)
<span class="hljs-keyword">SELECT</span> 
    read_log.*, cte.tp 
<span class="hljs-keyword">FROM</span> 
    read_log, cte 
<span class="hljs-keyword">WHERE</span> 
    read_log.reader_name = cte.reader_name;
</code></pre>
<p>Result Set:</p>
<pre><code class="lang-sql"> id | reader_name |    book_title    | pages | read_date  |  tp  
<span class="hljs-comment">----+-------------+------------------+-------+------------+------</span>
  1 | Giorgos     | The Iliad        |   711 | 2023-01-01 | 1911
  2 | Giorgos     | The Odyssey      |   800 | 2023-02-15 | 1911
  3 | Giorgos     | The Republic     |   400 | 2023-03-30 | 1911
  4 | Emmanouela  | Antigone         |   150 | 2023-01-20 |  450
  5 | Emmanouela  | Oedipus Rex      |   100 | 2023-02-26 |  450
  6 | Emmanouela  | The Symposium    |   200 | 2023-04-10 |  450
  7 | Eleni       | The Histories    |   900 | 2023-02-01 | 1610
  8 | Eleni       | Works and Days   |   150 | 2023-03-15 | 1610
  9 | Eleni       | Prometheus Bound |    80 | 2023-04-25 | 1610
 10 | Eleni       | Metamorphoses    |   480 | 2023-02-15 | 1610
 11 | Konstantina | The Iliad        |   711 | 2023-01-22 | 1711
 12 | Konstantina | The Odyssey      |   800 | 2023-03-18 | 1711
 13 | Konstantina | The Symposium    |   200 | 2023-04-30 | 1711
 14 | Andreas     | Antigone         |   150 | 2023-01-30 | 1130
 15 | Andreas     | Oedipus Rex      |   100 | 2023-02-20 | 1130
 16 | Andreas     | The Republic     |   400 | 2023-04-18 | 1130
 17 | Andreas     | Metamorphoses    |   480 | 2023-05-08 | 1130
</code></pre>
<p><strong>With window functions</strong></p>
<pre><code class="lang-sql"><span class="hljs-keyword">SELECT</span>
    *, <span class="hljs-keyword">sum</span>(pages) <span class="hljs-keyword">OVER</span>(<span class="hljs-keyword">PARTITION</span> <span class="hljs-keyword">BY</span> reader_name) <span class="hljs-keyword">AS</span> tp
<span class="hljs-keyword">FROM</span>
    read_log;
</code></pre>
<p><em>Notice, here that we are using</em> <code>PARTITION BY</code> in the <code>OVER</code> clause, since we want to calculate the <code>SUM</code> of pages per reader_name.</p>
<p>Result Set:</p>
<pre><code class="lang-sql"> id | reader_name |    book_title    | pages | read_date  |  tp  
<span class="hljs-comment">----+-------------+------------------+-------+------------+------</span>
 14 | Andreas     | Antigone         |   150 | 2023-01-30 | 1130
 17 | Andreas     | Metamorphoses    |   480 | 2023-05-08 | 1130
 16 | Andreas     | The Republic     |   400 | 2023-04-18 | 1130
 15 | Andreas     | Oedipus Rex      |   100 | 2023-02-20 | 1130
  9 | Eleni       | Prometheus Bound |    80 | 2023-04-25 | 1610
  7 | Eleni       | The Histories    |   900 | 2023-02-01 | 1610
  8 | Eleni       | Works and Days   |   150 | 2023-03-15 | 1610
 10 | Eleni       | Metamorphoses    |   480 | 2023-02-15 | 1610
  6 | Emmanouela  | The Symposium    |   200 | 2023-04-10 |  450
  5 | Emmanouela  | Oedipus Rex      |   100 | 2023-02-26 |  450
  4 | Emmanouela  | Antigone         |   150 | 2023-01-20 |  450
  1 | Giorgos     | The Iliad        |   711 | 2023-01-01 | 1911
  2 | Giorgos     | The Odyssey      |   800 | 2023-02-15 | 1911
  3 | Giorgos     | The Republic     |   400 | 2023-03-30 | 1911
 12 | Konstantina | The Odyssey      |   800 | 2023-03-18 | 1711
 11 | Konstantina | The Iliad        |   711 | 2023-01-22 | 1711
 13 | Konstantina | The Symposium    |   200 | 2023-04-30 | 1711
</code></pre>
<p>Note: the rows come back in a different order. This is not an issue, since the query has no <code>ORDER BY</code> and we didn't have any ordering requirements.</p>
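<p>If it helps to think about it procedurally, here is a small Go sketch of what <code>sum(pages) OVER (PARTITION BY reader_name)</code> computes. It is only a mental model, not how PostgreSQL executes the query; the <code>row</code> type and the <code>sumOverPartition</code> function are made up for this illustration and use a subset of the sample data.</p>

```go
package main

import "fmt"

// row is a simplified read_log row with only the columns we need.
type row struct {
	reader string
	pages  int
}

// sumOverPartition returns one value per input row: the total pages of
// the partition (reader) that the row belongs to. Unlike GROUP BY, the
// number of rows in the output is the same as in the input.
func sumOverPartition(rows []row) []int {
	// First pass: compute the total for every partition.
	totals := make(map[string]int)
	for _, r := range rows {
		totals[r.reader] += r.pages
	}

	// Second pass: attach the partition total to every row.
	out := make([]int, len(rows))
	for i, r := range rows {
		out[i] = totals[r.reader]
	}
	return out
}

func main() {
	rows := []row{
		{"Emmanouela", 150}, {"Emmanouela", 100}, {"Emmanouela", 200},
		{"Giorgos", 711}, {"Giorgos", 800}, {"Giorgos", 400},
	}
	fmt.Println(sumOverPartition(rows)) // [450 450 450 1911 1911 1911]
}
```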
<p><mark>Let's change our requirement once more: now we want the running total of pages read by each reader, ordered by the date when they finished each book.<br />So, for example, for reader </mark> <code>Andreas</code> <mark>we need an output like:</mark></p>
<pre><code class="lang-sql">14 | Andreas | Antigone | 150 | 2023-01-30 | 150 
15 | Andreas | Oedipus Rex | 100 | 2023-02-20 | 250 
16 | Andreas | The Republic | 400 | 2023-04-18 | 650 
17 | Andreas | Metamorphoses | 480 | 2023-05-08 | 1130
</code></pre>
<p><strong>Without window functions</strong> this is not so trivial:</p>
<pre><code class="lang-sql"><span class="hljs-keyword">WITH</span> A <span class="hljs-keyword">AS</span> (                                      
    <span class="hljs-keyword">SELECT</span>                                                                   
        read_date,
        reader_name,
        <span class="hljs-keyword">SUM</span>(pages) <span class="hljs-keyword">AS</span> tp
    <span class="hljs-keyword">FROM</span>(
        <span class="hljs-keyword">SELECT</span>
            t1.read_date,
            t1.reader_name,
            t2.pages
        <span class="hljs-keyword">FROM</span> 
            read_log t1
        <span class="hljs-keyword">JOIN</span> 
            read_log t2
        <span class="hljs-keyword">ON</span> 
            t1.reader_name = t2.reader_name
        <span class="hljs-keyword">AND</span> t1.read_date &gt;= t2.read_date
    ) t3
    <span class="hljs-keyword">GROUP</span> <span class="hljs-keyword">BY</span>
        read_date,
        reader_name
)
<span class="hljs-keyword">SELECT</span> 
    read_log.*,
    A.tp
<span class="hljs-keyword">FROM</span> 
    read_log
<span class="hljs-keyword">JOIN</span> A <span class="hljs-keyword">ON</span> 
    read_log.read_date = A.read_date
    <span class="hljs-keyword">AND</span> read_log.reader_name = A.reader_name
;
</code></pre>
<p>Result Set</p>
<pre><code class="lang-sql"> id | reader_name |    book_title    | pages | read_date  |  tp  
<span class="hljs-comment">----+-------------+------------------+-------+------------+------</span>
  1 | Giorgos     | The Iliad        |   711 | 2023-01-01 |  711
  2 | Giorgos     | The Odyssey      |   800 | 2023-02-15 | 1511
  3 | Giorgos     | The Republic     |   400 | 2023-03-30 | 1911
  4 | Emmanouela  | Antigone         |   150 | 2023-01-20 |  150
  5 | Emmanouela  | Oedipus Rex      |   100 | 2023-02-26 |  250
  6 | Emmanouela  | The Symposium    |   200 | 2023-04-10 |  450
  7 | Eleni       | The Histories    |   900 | 2023-02-01 |  900
  8 | Eleni       | Works and Days   |   150 | 2023-03-15 | 1530
  9 | Eleni       | Prometheus Bound |    80 | 2023-04-25 | 1610
 10 | Eleni       | Metamorphoses    |   480 | 2023-02-15 | 1380
 11 | Konstantina | The Iliad        |   711 | 2023-01-22 |  711
 12 | Konstantina | The Odyssey      |   800 | 2023-03-18 | 1511
 13 | Konstantina | The Symposium    |   200 | 2023-04-30 | 1711
 14 | Andreas     | Antigone         |   150 | 2023-01-30 |  150
 15 | Andreas     | Oedipus Rex      |   100 | 2023-02-20 |  250
 16 | Andreas     | The Republic     |   400 | 2023-04-18 |  650
 17 | Andreas     | Metamorphoses    |   480 | 2023-05-08 | 1130
</code></pre>
<p><strong>Now using Window Function</strong></p>
<pre><code class="lang-sql"><span class="hljs-keyword">SELECT</span> 
    *, <span class="hljs-keyword">sum</span>(pages) <span class="hljs-keyword">OVER</span> (<span class="hljs-keyword">PARTITION</span> <span class="hljs-keyword">BY</span> reader_name <span class="hljs-keyword">ORDER</span> <span class="hljs-keyword">BY</span> read_date) <span class="hljs-keyword">AS</span> tp
<span class="hljs-keyword">FROM</span>
    read_log
;
</code></pre>
<p>Result Set</p>
<pre><code class="lang-sql"> id | reader_name |    book_title    | pages | read_date  |  tp  
<span class="hljs-comment">----+-------------+------------------+-------+------------+------</span>
 14 | Andreas     | Antigone         |   150 | 2023-01-30 |  150
 15 | Andreas     | Oedipus Rex      |   100 | 2023-02-20 |  250
 16 | Andreas     | The Republic     |   400 | 2023-04-18 |  650
 17 | Andreas     | Metamorphoses    |   480 | 2023-05-08 | 1130
  7 | Eleni       | The Histories    |   900 | 2023-02-01 |  900
 10 | Eleni       | Metamorphoses    |   480 | 2023-02-15 | 1380
  8 | Eleni       | Works and Days   |   150 | 2023-03-15 | 1530
  9 | Eleni       | Prometheus Bound |    80 | 2023-04-25 | 1610
  4 | Emmanouela  | Antigone         |   150 | 2023-01-20 |  150
  5 | Emmanouela  | Oedipus Rex      |   100 | 2023-02-26 |  250
  6 | Emmanouela  | The Symposium    |   200 | 2023-04-10 |  450
  1 | Giorgos     | The Iliad        |   711 | 2023-01-01 |  711
  2 | Giorgos     | The Odyssey      |   800 | 2023-02-15 | 1511
  3 | Giorgos     | The Republic     |   400 | 2023-03-30 | 1911
 11 | Konstantina | The Iliad        |   711 | 2023-01-22 |  711
 12 | Konstantina | The Odyssey      |   800 | 2023-03-18 | 1511
 13 | Konstantina | The Symposium    |   200 | 2023-04-30 | 1711
</code></pre>
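<p>Notice how the window function behaves differently from the previous example, which had no <code>ORDER BY</code>.<br /><em>Rows are ordered according to this expression</em> <strong><em>before</em></strong> <em>the function is applied.</em> In addition, when <code>ORDER BY</code> is present, the default window frame runs from the start of the partition through the current row (and its peers), which is why <code>SUM</code> now produces a running total instead of the partition total.</p>
<p><mark>Now we want to know the moving average of pages each reader reads, that is, the average over all the books a reader has finished up to and including each date.</mark></p>
<p>For example, for reader_name <code>Giorgos</code> we expect 711, then (711+800)/2 = 755.5, then (711+800+400)/3 = 637.</p>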
<p><strong>Without window functions</strong></p>
<pre><code class="lang-sql"><span class="hljs-keyword">WITH</span> A <span class="hljs-keyword">AS</span> (                                      
    <span class="hljs-keyword">SELECT</span>                                                                   
        read_date,
        reader_name,
        <span class="hljs-keyword">AVG</span>(pages) <span class="hljs-keyword">AS</span> moving_avg
    <span class="hljs-keyword">FROM</span>(
        <span class="hljs-keyword">SELECT</span>
            t1.read_date,
            t1.reader_name,
            t2.pages
        <span class="hljs-keyword">FROM</span> 
            read_log t1
        <span class="hljs-keyword">JOIN</span> 
            read_log t2
        <span class="hljs-keyword">ON</span> 
            t1.reader_name = t2.reader_name
        <span class="hljs-keyword">AND</span> t1.read_date &gt;= t2.read_date
    ) t3
    <span class="hljs-keyword">GROUP</span> <span class="hljs-keyword">BY</span>
        read_date,
        reader_name
)
<span class="hljs-keyword">SELECT</span> 
    read_log.reader_name,
    read_log.read_date,
    A.moving_avg
<span class="hljs-keyword">FROM</span> 
    read_log
<span class="hljs-keyword">JOIN</span> A <span class="hljs-keyword">ON</span> 
    read_log.read_date = A.read_date
    <span class="hljs-keyword">AND</span> read_log.reader_name = A.reader_name
;
</code></pre>
<p>Result set:</p>
<pre><code class="lang-sql"> reader_name | read_date  |      moving_avg      
<span class="hljs-comment">-------------+------------+----------------------</span>
 Giorgos     | 2023-01-01 | 711.0000000000000000
 Giorgos     | 2023-02-15 | 755.5000000000000000
 Giorgos     | 2023-03-30 | 637.0000000000000000
 Emmanouela  | 2023-01-20 | 150.0000000000000000
 Emmanouela  | 2023-02-26 | 125.0000000000000000
 Emmanouela  | 2023-04-10 | 150.0000000000000000
 Eleni       | 2023-02-01 | 900.0000000000000000
 Eleni       | 2023-03-15 | 510.0000000000000000
 Eleni       | 2023-04-25 | 402.5000000000000000
 Eleni       | 2023-02-15 | 690.0000000000000000
 Konstantina | 2023-01-22 | 711.0000000000000000
 Konstantina | 2023-03-18 | 755.5000000000000000
 Konstantina | 2023-04-30 | 570.3333333333333333
 Andreas     | 2023-01-30 | 150.0000000000000000
 Andreas     | 2023-02-20 | 125.0000000000000000
 Andreas     | 2023-04-18 | 216.6666666666666667
 Andreas     | 2023-05-08 | 282.5000000000000000
</code></pre>
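<p>For comparison, the same cumulative average can be sketched with a window function, reusing the <code>PARTITION BY ... ORDER BY</code> pattern from the running total above. This is a minimal sketch that assumes the same <code>read_log</code> table; it should return the same values as the join-based query, possibly in a different row order:</p>
<pre><code class="lang-sql">SELECT
    reader_name,
    read_date,
    -- with ORDER BY, the default frame runs from the start of the
    -- partition through the current row, so this is a cumulative average
    AVG(pages) OVER (PARTITION BY reader_name ORDER BY read_date) AS moving_avg
FROM
    read_log
;
</code></pre>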
<p><mark>Let's find the average pages read by each reader, considering only the current book and the one before it. For reader Giorgos the first value should be 711, the second (711+800)/2 = 755.5, and the third (800+400)/2 = 600.</mark></p>
<p><strong>Without window functions</strong></p>
<p>I cannot think of an obvious way to do that, although I am pretty sure it is possible.</p>
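<p>One possible, admittedly non-obvious, sketch is a self-join that keeps only the current book and the one directly before it, using a correlated subquery to count how many of the reader's books lie between the two rows. It assumes that a reader never finishes two books on the same date (true for the sample data), and it is meant as an illustration rather than a recommended approach:</p>
<pre><code class="lang-sql">SELECT
    t1.reader_name,
    t1.read_date,
    AVG(t2.pages) AS moving_avg
FROM
    read_log t1
JOIN
    read_log t2
ON
    t2.reader_name = t1.reader_name
    AND t2.read_date &lt;= t1.read_date
    -- keep t2 only if at most one of the reader's books was finished
    -- after t2 and up to (and including) t1: that is t1 itself or its predecessor
    AND (
        SELECT COUNT(*)
        FROM read_log t3
        WHERE t3.reader_name = t1.reader_name
          AND t3.read_date &gt; t2.read_date
          AND t3.read_date &lt;= t1.read_date
    ) &lt;= 1
GROUP BY
    t1.reader_name,
    t1.read_date
ORDER BY
    t1.reader_name,
    t1.read_date
;
</code></pre>
<p>Compared with this, the window function version that follows is both shorter and easier to adapt to a different window size.</p>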
<p><strong>Using window functions</strong></p>
<p>If you look at the window function syntax, you will see the <code>frame_clause</code>. It is very helpful when you need more control over the rows that the window function considers. We are going to use it here.</p>
<pre><code class="lang-sql"><span class="hljs-keyword">SELECT</span> 
    reader_name, 
    read_date,
    <span class="hljs-keyword">AVG</span>(pages) <span class="hljs-keyword">OVER</span>(<span class="hljs-keyword">PARTITION</span> <span class="hljs-keyword">BY</span> reader_name <span class="hljs-keyword">ORDER</span> <span class="hljs-keyword">BY</span> read_date <span class="hljs-keyword">ROWS</span> <span class="hljs-keyword">BETWEEN</span> <span class="hljs-number">1</span> <span class="hljs-keyword">PRECEDING</span> <span class="hljs-keyword">AND</span> <span class="hljs-keyword">CURRENT</span> <span class="hljs-keyword">ROW</span>) <span class="hljs-keyword">as</span> moving_avg
<span class="hljs-keyword">FROM</span> 
    read_log
;
</code></pre>
<p>Notice that we now use a <code>frame_clause</code>:</p>
<pre><code class="lang-sql">ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
</code></pre>
<p>which further refines the 'window' that the window functions operate on.<br />In our case, it controls the number of rows included in the calculation.</p>
<p>A <code>frame_clause</code> can use one of three modes: <code>RANGE</code>, <code>ROWS</code>, or <code>GROUPS</code>. Here we used <code>ROWS</code>. I am not going to explain the <code>frame_clause</code> any further.<br />You can search online or read the <a target="_blank" href="https://www.postgresql.org/docs/current/sql-expressions.html#SYNTAX-WINDOW-FUNCTIONS">documentation</a> for further examples.</p>
<p>Result Set</p>
<pre><code class="lang-sql"> reader_name | read_date  |      moving_avg      
<span class="hljs-comment">-------------+------------+----------------------</span>
 Andreas     | 2023-01-30 | 150.0000000000000000
 Andreas     | 2023-02-20 | 125.0000000000000000
 Andreas     | 2023-04-18 | 250.0000000000000000
 Andreas     | 2023-05-08 | 440.0000000000000000
 Eleni       | 2023-02-01 | 900.0000000000000000
 Eleni       | 2023-02-15 | 690.0000000000000000
 Eleni       | 2023-03-15 | 315.0000000000000000
 Eleni       | 2023-04-25 | 115.0000000000000000
 Emmanouela  | 2023-01-20 | 150.0000000000000000
 Emmanouela  | 2023-02-26 | 125.0000000000000000
 Emmanouela  | 2023-04-10 | 150.0000000000000000
 Giorgos     | 2023-01-01 | 711.0000000000000000
 Giorgos     | 2023-02-15 | 755.5000000000000000
 Giorgos     | 2023-03-30 | 600.0000000000000000
 Konstantina | 2023-01-22 | 711.0000000000000000
 Konstantina | 2023-03-18 | 755.5000000000000000
 Konstantina | 2023-04-30 | 500.0000000000000000
</code></pre>
<h2 id="heading-window-functions-in-action"><strong>Window Functions in Action</strong></h2>
<p>Now that we understand the syntax and concept of window functions, let's take a look at some examples using different window functions on our <code>read_log</code> table.</p>
<h3 id="heading-ranking-functions">Ranking Functions</h3>
<p>Ranking functions are a category of functions that assign a rank to each row within its window partition. This is useful when you want to find the top-N or bottom-N rows.</p>
<p><strong>RANK()</strong></p>
<p>The <code>RANK()</code> function ranks each row according to the window's <code>ORDER BY</code>. Rows that tie on the ordering value receive the same rank, and the ranks that follow are skipped accordingly.</p>
<p>We want to rank the readers by the total number of pages they have read:</p>
<pre><code class="lang-sql"><span class="hljs-keyword">SELECT</span> 
    reader_name, 
    <span class="hljs-keyword">SUM</span>(pages) <span class="hljs-keyword">as</span> total_pages, 
    <span class="hljs-keyword">RANK</span>() <span class="hljs-keyword">OVER</span>(<span class="hljs-keyword">ORDER</span> <span class="hljs-keyword">BY</span> <span class="hljs-keyword">SUM</span>(pages) <span class="hljs-keyword">DESC</span>) <span class="hljs-keyword">as</span> <span class="hljs-keyword">rank</span>
<span class="hljs-keyword">FROM</span> 
    read_log
<span class="hljs-keyword">GROUP</span> <span class="hljs-keyword">BY</span> 
reader_name
;
</code></pre>
<p>Result Set:</p>
<pre><code class="lang-sql"> reader_name | total_pages | rank 
<span class="hljs-comment">-------------+-------------+------</span>
 Giorgos     |        1911 |    1
 Konstantina |        1711 |    2
 Eleni       |        1610 |    3
 Andreas     |        1130 |    4
 Emmanouela  |         450 |    5
</code></pre>
<p>Now I would like you to insert this row into our database:</p>
<pre><code class="lang-sql"><span class="hljs-keyword">INSERT</span> <span class="hljs-keyword">INTO</span> read_log
(reader_name, book_title, pages, read_date) 
<span class="hljs-keyword">VALUES</span>
(<span class="hljs-string">'Konstantina'</span>, <span class="hljs-string">'The Orestia'</span>, <span class="hljs-number">200</span>, <span class="hljs-string">'2023-05-15'</span>);
</code></pre>
<p>With this extra row, readers <code>Giorgos</code> and <code>Konstantina</code> have read the same number of pages (1911).</p>
<p>Run the query again:</p>
<pre><code class="lang-sql"><span class="hljs-keyword">SELECT</span> 
    reader_name, 
    <span class="hljs-keyword">SUM</span>(pages) <span class="hljs-keyword">as</span> total_pages, 
    <span class="hljs-keyword">RANK</span>() <span class="hljs-keyword">OVER</span>(<span class="hljs-keyword">ORDER</span> <span class="hljs-keyword">BY</span> <span class="hljs-keyword">SUM</span>(pages) <span class="hljs-keyword">DESC</span>) <span class="hljs-keyword">as</span> <span class="hljs-keyword">rank</span>
<span class="hljs-keyword">FROM</span> 
    read_log
<span class="hljs-keyword">GROUP</span> <span class="hljs-keyword">BY</span> 
reader_name
;
</code></pre>
<p>Notice the result now:</p>
<pre><code class="lang-sql"> reader_name | total_pages | rank 
<span class="hljs-comment">-------------+-------------+------</span>
 Giorgos     |        1911 |    1
 Konstantina |        1911 |    1
 Eleni       |        1610 |    3
 Andreas     |        1130 |    4
 Emmanouela  |         450 |    5
</code></pre>
<p><code>Giorgos</code> and <code>Konstantina</code> have the same rank, but there is a gap in the rank column (it jumps from 1 to 3).</p>
<p><strong>DENSE_RANK()</strong></p>
<p><code>DENSE_RANK()</code> works similarly to <code>RANK()</code>, but it doesn't leave gaps between groups of duplicate values.</p>
<pre><code class="lang-sql"><span class="hljs-keyword">SELECT</span> 
    reader_name, 
    <span class="hljs-keyword">SUM</span>(pages) <span class="hljs-keyword">as</span> total_pages, 
    <span class="hljs-keyword">DENSE_RANK</span>() <span class="hljs-keyword">OVER</span>(<span class="hljs-keyword">ORDER</span> <span class="hljs-keyword">BY</span> <span class="hljs-keyword">SUM</span>(pages) <span class="hljs-keyword">DESC</span>) <span class="hljs-keyword">as</span> <span class="hljs-keyword">rank</span>
<span class="hljs-keyword">FROM</span> 
    read_log
<span class="hljs-keyword">GROUP</span> <span class="hljs-keyword">BY</span> 
reader_name
;
</code></pre>
<p>Result:</p>
<pre><code class="lang-sql"> reader_name | total_pages | rank 
<span class="hljs-comment">-------------+-------------+------</span>
 Giorgos     |        1911 |    1
 Konstantina |        1911 |    1
 Eleni       |        1610 |    2
 Andreas     |        1130 |    3
 Emmanouela  |         450 |    4
</code></pre>
<p>Notice that now there is no gap and reader <code>Eleni</code> has rank 2.</p>
<p><strong>ROW_NUMBER()</strong></p>
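<p><code>ROW_NUMBER()</code> assigns a unique, sequential number to each row within its partition, following the window's <code>ORDER BY</code>. Unlike <code>RANK()</code>, rows that tie still get different numbers. This function is useful when you need a distinct value for each row belonging to the partition.</p>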
<pre><code class="lang-sql"><span class="hljs-keyword">SELECT</span> 
    reader_name, 
    book_title, 
    read_date,
    ROW_NUMBER() <span class="hljs-keyword">OVER</span>(<span class="hljs-keyword">PARTITION</span> <span class="hljs-keyword">BY</span> reader_name <span class="hljs-keyword">ORDER</span> <span class="hljs-keyword">BY</span> read_date) <span class="hljs-keyword">as</span> read_order
<span class="hljs-keyword">FROM</span> read_log;
</code></pre>
<p>This query will give a unique row number to each book read by the same reader, ordered by the date they finished each book.</p>
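<p>A common use of this numbering is a top-N query per group. As a minimal sketch, assuming the same <code>read_log</code> table, the following keeps the two most recently finished books of each reader; the alias <code>rn</code> and the limit of 2 are only illustrative choices:</p>
<pre><code class="lang-sql">SELECT
    reader_name,
    book_title,
    read_date
FROM (
    SELECT
        reader_name,
        book_title,
        read_date,
        -- number each reader's books from the most recent to the oldest
        ROW_NUMBER() OVER (PARTITION BY reader_name ORDER BY read_date DESC) AS rn
    FROM read_log
) numbered
WHERE rn &lt;= 2
ORDER BY reader_name, read_date DESC;
</code></pre>
<p>The subquery is needed because a window function cannot be referenced directly in the <code>WHERE</code> clause of the same query level.</p>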
<h3 id="heading-aggregate-functions"><strong>Aggregate Functions</strong></h3>
<p>I have already shown you a couple of examples of aggregate functions used as window functions: <code>SUM()</code> and <code>AVG()</code>.</p>
<p><strong>SUM()</strong></p>
<p><code>SUM()</code> can be useful to calculate running totals.</p>
<p><strong>AVG()</strong></p>
<p><code>AVG()</code> can be useful to calculate moving averages.</p>
<p>You can use any <a target="_blank" href="https://www.postgresql.org/docs/15/functions-aggregate.html">built-in</a> aggregate function as a window function.</p>
<h3 id="heading-general-purpose-window-functions">General Purpose Window Functions</h3>
<p><strong>FIRST_VALUE() and LAST_VALUE()</strong></p>
<p><code>FIRST_VALUE()</code> and <code>LAST_VALUE()</code> are window functions that return the first and last value of an ordered set of values. Let's see which book each reader started and ended with:</p>
<pre><code class="lang-sql"><span class="hljs-keyword">SELECT</span> 
    <span class="hljs-keyword">DISTINCT</span> <span class="hljs-keyword">ON</span> (reader_name) reader_name, 
    <span class="hljs-keyword">FIRST_VALUE</span>(book_title) <span class="hljs-keyword">OVER</span>(<span class="hljs-keyword">PARTITION</span> <span class="hljs-keyword">BY</span> reader_name <span class="hljs-keyword">ORDER</span> <span class="hljs-keyword">BY</span> read_date) <span class="hljs-keyword">as</span> first_book,
    <span class="hljs-keyword">LAST_VALUE</span>(book_title) <span class="hljs-keyword">OVER</span>(<span class="hljs-keyword">PARTITION</span> <span class="hljs-keyword">BY</span> reader_name 
                                <span class="hljs-keyword">ORDER</span> <span class="hljs-keyword">BY</span> read_date 
                                <span class="hljs-keyword">ROWS</span> <span class="hljs-keyword">BETWEEN</span> <span class="hljs-keyword">UNBOUNDED</span> <span class="hljs-keyword">PRECEDING</span> <span class="hljs-keyword">AND</span> <span class="hljs-keyword">UNBOUNDED</span> <span class="hljs-keyword">FOLLOWING</span>) <span class="hljs-keyword">as</span> last_book
<span class="hljs-keyword">FROM</span> read_log;
</code></pre>
<p>Note from the docs:</p>
<blockquote>
<p>Note that <code>first_value</code>, <code>last_value</code>, and <code>nth_value</code> consider only the rows within the “window frame”, which by default contains the rows from the start of the partition through the last peer of the current row. This is likely to give unhelpful results for <code>last_value</code> and sometimes also <code>nth_value</code>. You can redefine the frame by adding a suitable frame specification (<code>RANGE</code>, <code>ROWS</code> or <code>GROUPS</code>) to the <code>OVER</code> clause. See <a target="_blank" href="https://www.postgresql.org/docs/15/sql-expressions.html#SYNTAX-WINDOW-FUNCTIONS"><strong>Section 4.2.8</strong></a> for more information about frame specifications.</p>
<p>...</p>
<p>To obtain aggregation over the whole partition, omit <code>ORDER BY</code> or use <code>ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING</code>.</p>
</blockquote>
<p>The <code>ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING</code> clause ensures that the window frame includes all rows in the partition.</p>
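<p>The docs also mention omitting <code>ORDER BY</code>, but without it the order of the rows inside the partition is unspecified, so <code>LAST_VALUE()</code> would not be guaranteed to return the most recent book. If you prefer not to spell out the frame, a safer alternative is <code>FIRST_VALUE()</code> with a descending sort:</p>
<pre><code class="lang-sql"><span class="hljs-keyword">SELECT</span> 
    <span class="hljs-keyword">DISTINCT</span> <span class="hljs-keyword">ON</span> (reader_name) reader_name, 
    <span class="hljs-keyword">FIRST_VALUE</span>(book_title) <span class="hljs-keyword">OVER</span>(<span class="hljs-keyword">PARTITION</span> <span class="hljs-keyword">BY</span> reader_name <span class="hljs-keyword">ORDER</span> <span class="hljs-keyword">BY</span> read_date) <span class="hljs-keyword">as</span> first_book,
    <span class="hljs-keyword">FIRST_VALUE</span>(book_title) <span class="hljs-keyword">OVER</span>(<span class="hljs-keyword">PARTITION</span> <span class="hljs-keyword">BY</span> reader_name <span class="hljs-keyword">ORDER</span> <span class="hljs-keyword">BY</span> read_date <span class="hljs-keyword">DESC</span>) <span class="hljs-keyword">as</span> last_book
<span class="hljs-keyword">FROM</span> read_log;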
</code></pre>
<p>Result:</p>
<pre><code class="lang-sql"> reader_name |  first_book   |    last_book     
<span class="hljs-comment">-------------+---------------+------------------</span>
 Andreas     | Antigone      | Metamorphoses
 Eleni       | The Histories | Prometheus Bound
 Emmanouela  | Antigone      | The Symposium
 Giorgos     | The Iliad     | The Republic
 Konstantina | The Iliad     | The Orestia
</code></pre>
<p><strong>LAG() and LEAD()</strong></p>
<p><code>LAG()</code> and <code>LEAD()</code> functions allow you to fetch the value of a previous row or following row within your data partition.</p>
<p>For example, you can use <code>LAG()</code> to get the book each reader read before the current one:</p>
<pre><code class="lang-sql"><span class="hljs-keyword">SELECT</span> 
    reader_name, 
    book_title, 
    LAG(book_title) <span class="hljs-keyword">OVER</span>(<span class="hljs-keyword">PARTITION</span> <span class="hljs-keyword">BY</span> reader_name <span class="hljs-keyword">ORDER</span> <span class="hljs-keyword">BY</span> read_date) <span class="hljs-keyword">as</span> previous_book
<span class="hljs-keyword">FROM</span> 
read_log;
</code></pre>
<p>Result:</p>
<pre><code class="lang-sql"> reader_name |    book_title    | previous_book  
<span class="hljs-comment">-------------+------------------+----------------</span>
 Andreas     | Antigone         | 
 Andreas     | Oedipus Rex      | Antigone
 Andreas     | The Republic     | Oedipus Rex
 Andreas     | Metamorphoses    | The Republic
 Eleni       | The Histories    | 
 Eleni       | Metamorphoses    | The Histories
 Eleni       | Works and Days   | Metamorphoses
 Eleni       | Prometheus Bound | Works and Days
 Emmanouela  | Antigone         | 
 Emmanouela  | Oedipus Rex      | Antigone
 Emmanouela  | The Symposium    | Oedipus Rex
 Giorgos     | The Iliad        | 
 Giorgos     | The Odyssey      | The Iliad
 Giorgos     | The Republic     | The Odyssey
 Konstantina | The Iliad        | 
 Konstantina | The Odyssey      | The Iliad
 Konstantina | The Symposium    | The Odyssey
 Konstantina | The Orestia      | The Symposium
</code></pre>
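<p><code>LEAD()</code> works the same way in the other direction. The following sketch shows the next book of each reader and the number of days since the previous one. It assumes that <code>read_date</code> is of type <code>DATE</code> (so that subtracting two dates yields a number of days) and uses a named window <code>w</code>, an illustrative name, to avoid repeating the window definition:</p>
<pre><code class="lang-sql">SELECT
    reader_name,
    book_title,
    read_date,
    -- NULL for the last book of each reader
    LEAD(book_title) OVER w AS next_book,
    -- NULL for the first book of each reader
    read_date - LAG(read_date) OVER w AS days_since_previous
FROM
    read_log
WINDOW w AS (PARTITION BY reader_name ORDER BY read_date);
</code></pre>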
<p>You can find a list of all the built-in window functions in the <a target="_blank" href="https://www.postgresql.org/docs/15/functions-window.html">docs</a>.</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>In this blog post, we covered the basics of PostgreSQL window functions. I showed you the basic syntax of window functions and how you can use them to rank data, keep running totals, compute moving averages, and look at the rows around each row.</p>
<p>The key to mastering window functions is practice. Get the test dataset or create your own, look up the documentation, and try the functions out with different examples.</p>
<h3 id="heading-references">References</h3>
<ul>
<li><p><a target="_blank" href="https://www.postgresql.org/docs/15/functions-window.html">https://www.postgresql.org/docs/15/functions-window.html</a></p>
</li>
<li><p><a target="_blank" href="https://www.postgresqltutorial.com/postgresql-window-function/">https://www.postgresqltutorial.com/postgresql-window-function/</a></p>
</li>
<li><p><a target="_blank" href="https://sqlzoo.net/wiki/Window_functions">https://sqlzoo.net/wiki/Window_functions</a></p>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[Distributed Google Maps scraping]]></title><description><![CDATA[Introduction
In this post, I will show you how you can utilize the power of Kubernetes to scrape data from Google Maps without using an API key.
for the tutorial, I will use as an example deploying to

But this will work in any managed Kubernetes pro...]]></description><link>https://blog.gkomninos.com/distributed-google-maps-scraping</link><guid isPermaLink="true">https://blog.gkomninos.com/distributed-google-maps-scraping</guid><category><![CDATA[webscraping ]]></category><category><![CDATA[Google Maps Scraper]]></category><category><![CDATA[google maps crawler]]></category><category><![CDATA[golang]]></category><category><![CDATA[distributed scraping]]></category><dc:creator><![CDATA[Georgios Komninos]]></dc:creator><pubDate>Sun, 14 May 2023 15:10:45 GMT</pubDate><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>In this post, I will show you how you can utilize the power of Kubernetes to scrape data from Google Maps without using an API key.</p>
<p>For this tutorial, I will use DigitalOcean as the example deployment target:</p>
<p><a target="_blank" href="https://www.digitalocean.com/?refcode=c11136c4693c&amp;utm_campaign=Referral_Invite&amp;utm_medium=Referral_Program&amp;utm_source=badge"><img src="https://web-platforms.sfo2.cdn.digitaloceanspaces.com/WWW/Badge%201.svg" alt="DigitalOcean Referral Badge" /></a></p>
<p>However, the same steps should work with any managed Kubernetes provider.</p>
<p>The whole procedure to get the scraper up and running won't take more than 20 minutes. So give it a try.</p>
<h3 id="heading-prerequisites">Prerequisites</h3>
<ul>
<li><p>Create a <a target="_blank" href="https://m.do.co/c/c11136c4693c">Digital Ocean account</a>. If you do not have one yet, I recommend creating it via the <a target="_blank" href="https://m.do.co/c/c11136c4693c">referral link</a>: you get $200 of credit, and I may also get $25 (depending on whether you continue using Digital Ocean), so you can try the tutorial for free.<br />  Note: To get the $200 credit you need to add a payment method.</p>
</li>
<li><p>Install kubectl on your local machine. Follow the official <a target="_blank" href="https://kubernetes.io/docs/tasks/tools/#kubectl">instructions</a>.</p>
</li>
</ul>
<h2 id="heading-create-a-k8s-cluster">Create a K8s Cluster</h2>
<ol>
<li><p>Log in to your Digital Ocean account and click Create at the top right.</p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1684072184794/c753a3e8-6dd1-4e15-9694-c7949c415a8f.png" alt="Digital Ocean Dashboard | Google maps scraper golang" class="image--center mx-auto" /></p>
</li>
</ol>
<p>In the menu that pops up, select Kubernetes.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1684072328080/bc35fccc-e5be-4642-8460-ac14b27f1d20.png" alt="Digital Ocean Menu | Kubernetes Google Maps scraper" class="image--center mx-auto" /></p>
<p>After clicking Kubernetes, the cluster creation page opens.</p>
<p>For the purposes of the tutorial, leave the defaults.<br />In a real-life scenario you would pick the desired region and configure the nodes to your needs.</p>
<p>If you registered in Digital Ocean via the <a target="_blank" href="https://m.do.co/c/c11136c4693c">referral link</a>, you don't need to worry about costs for now. Keep in mind, though, that we are going to start headless web browsers, so the nodes need enough memory and CPU.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1684072712442/049956b5-2fbc-4098-add3-e81b34ba81b1.png" alt="Create a Kubernetes cluster in Digital Ocean | Setup Google Maps scraper on Kubernetes" class="image--center mx-auto" /></p>
<p>Please wait until the cluster initializes. This can take around 5 minutes.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1684072859988/41f0a5ce-e70d-4a0d-964d-c79d527bfe92.png" alt="Kubernetes Cluster is creating " class="image--center mx-auto" /></p>
<p>Once the cluster is provisioned, download the Kubernetes configuration file:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1684073123623/25f4bb00-0848-4ac7-88ec-ba4603b81bc5.png" alt="Get K8s configuration | How to setup google-maps-scraper in kubernetes" class="image--center mx-auto" /></p>
<p>Download the configuration file and take note of its location. For the purposes of the tutorial we assume that it is located at <code>/home/giorgos/k8s.config.yaml</code>, i.e. <code>$HOME/k8s.config.yaml</code>.</p>
<p>Let's check that we can connect:</p>
<pre><code class="lang-bash">kubectl --kubeconfig=<span class="hljs-variable">$HOME</span>/k8s.config.yaml get pods &amp;&amp; <span class="hljs-built_in">echo</span> $?
</code></pre>
<p>You should get output like:</p>
<pre><code class="lang-bash">No resources found <span class="hljs-keyword">in</span> default namespace.
0
</code></pre>
<h2 id="heading-create-a-postgresql-database">Create a PostgreSQL database</h2>
<p>In your <a target="_blank" href="https://cloud.digitalocean.com/databases?i=61501f">Digital Ocean dashboard</a>, click <code>Databases</code> in the left panel, or follow this link to <a target="_blank" href="https://cloud.digitalocean.com/databases?i=61501f">create a database in Digital Ocean</a>.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1684073802730/b5c56b1e-1628-4405-95ec-9db0bafd70aa.png" alt="Select PostgreSQL " class="image--center mx-auto" /></p>
<p>Select PostgreSQL as the database engine and, on the next page, leave the defaults (the lowest tier).</p>
<p>Then click Create:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1684073990650/749c2e1c-68b5-45df-b462-77e8db2b8cfa.png" alt class="image--center mx-auto" /></p>
<p>Again wait a bit until it is provisioned.</p>
<p>Once the database is ready we need to:</p>
<ul>
<li>Create a User and a database</li>
</ul>
<h3 id="heading-create-a-user-and-a-database">Create a User and a database</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1684074447574/a3d6d50b-5b26-4126-b67a-2574b7035d4d.png" alt="setup a database digital ocean for the google maps scraper" class="image--center mx-auto" /></p>
<p>First, open a terminal (or your favorite GUI tool) and connect to your database</p>
<pre><code class="lang-bash">psql -p 25060 -h db-postgresql-sfo3-81615-do-user-14100026-0.b.db.ondigitalocean.com -U doadmin -d defaultdb
</code></pre>
<p>(Please replace host with yours)</p>
<p>If you managed to connect then we can move to the next step.</p>
<h3 id="heading-create-tables">Create tables</h3>
<pre><code class="lang-bash">    CREATE TABLE gmaps_jobs(
        id UUID PRIMARY KEY,
        priority SMALLINT NOT NULL,
        payload_type TEXT NOT NULL,
        payload BYTEA NOT NULL,
        created_at TIMESTAMP WITH TIME ZONE NOT NULL,
        status TEXT NOT NULL
    );

    CREATE TABLE results(
        id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
        title TEXT NOT NULL,
        category TEXT NOT NULL,
        address TEXT NOT NULL,
        openhours TEXT NOT NULL,
        website TEXT NOT NULL,
        phone TEXT NOT NULL,
        pluscode TEXT  NOT NULL,
        review_count INT NOT NULL,
        rating NUMERIC NOT NULL
    );
</code></pre>
<p>Execute the above queries in your database client.</p>
<h2 id="heading-google-maps-scraper-deployment">Google maps scraper deployment</h2>
<p>First, create a file with your search queries, one per line. A sample:</p>
<pre><code class="lang-plaintext">bars in Athens
bars in Berlin
restaurants in Rome
</code></pre>
<p>Save this in a file named queries.txt.</p>
<p>Then run:</p>
<pre><code class="lang-bash">docker run -v <span class="hljs-variable">$PWD</span>/queries.txt:/queries.txt gosom/google-maps-scraper:v0.9.3  -depth 5 -input /queries.txt  -dsn <span class="hljs-string">"postgres://doadmin:{yourPassword}@{yourHost}:25060/defaultdb"</span> -produce -lang en
</code></pre>
<p>(Replace the placeholders with your password and your host.)</p>
<p>Be patient: the image is around 1GB and needs to be downloaded first.</p>
<p>Once the command finishes, verify that the jobs were inserted into the database:</p>
<pre><code class="lang-sql"><span class="hljs-keyword">select</span> <span class="hljs-keyword">count</span>(<span class="hljs-number">1</span>) <span class="hljs-keyword">from</span> gmaps_jobs
</code></pre>
<p>Run the above query in your database client. It should return 3 if you use my sample file.</p>
<p>We are now ready to start our scrapers.</p>
<p>Create a file named gmaps.deployment.yaml with the Kubernetes deployment configuration and paste the following:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">apps/v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Deployment</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">google-maps-scraper</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">selector:</span>
    <span class="hljs-attr">matchLabels:</span>
      <span class="hljs-attr">app:</span> <span class="hljs-string">google-maps-scraper</span>
  <span class="hljs-attr">replicas:</span> <span class="hljs-number">2</span>
  <span class="hljs-attr">template:</span>
    <span class="hljs-attr">metadata:</span>
      <span class="hljs-attr">labels:</span>
        <span class="hljs-attr">app:</span> <span class="hljs-string">google-maps-scraper</span>
    <span class="hljs-attr">spec:</span>
      <span class="hljs-attr">containers:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">google-maps-scraper</span>
        <span class="hljs-attr">image:</span> <span class="hljs-string">gosom/google-maps-scraper:v0.9.3</span>
        <span class="hljs-attr">imagePullPolicy:</span> <span class="hljs-string">IfNotPresent</span>
        <span class="hljs-attr">args:</span> [<span class="hljs-string">"-c"</span>, <span class="hljs-string">"1"</span>, <span class="hljs-string">"-depth"</span>, <span class="hljs-string">"5"</span>, <span class="hljs-string">"-dsn"</span>, <span class="hljs-string">"postgres://doadmin:{YourPassword}@{YourHost}:25060/defaultdb"</span>]
</code></pre>
<p>(Replace the password and host with your own)</p>
<p>Then apply the configuration:  </p>
<pre><code class="lang-bash">kubectl --kubeconfig=<span class="hljs-variable">$HOME</span>/k8s.config.yaml apply -f gmaps.deployment.yaml
</code></pre>
<p>Give it some time, since the image needs to also get downloaded.</p>
<p>Check the status of the pods:</p>
<pre><code class="lang-bash">giorgos@gtp:~$ kubectl --kubeconfig=<span class="hljs-variable">$HOME</span>/k8s.config.yaml get pods
NAME                                   READY   STATUS    RESTARTS   AGE
google-maps-scraper-6489d96b84-7nltl   1/1     Running   0          68s
google-maps-scraper-6489d96b84-vvx6c   1/1     Running   0          116s
giorgos@gtp:~$
</code></pre>
<p>Meanwhile, periodically check the results table:</p>
<pre><code class="lang-sql"> <span class="hljs-keyword">select</span> <span class="hljs-keyword">count</span>(<span class="hljs-number">1</span>) <span class="hljs-keyword">from</span> results;
</code></pre>
<p>The results table will slowly start to fill up:</p>
<pre><code class="lang-sql">defaultdb=&gt; <span class="hljs-keyword">select</span> * <span class="hljs-keyword">from</span> results <span class="hljs-keyword">limit</span> <span class="hljs-number">5</span>;
-[ RECORD 1 ]+<span class="hljs-comment">-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------</span>
id           | 1
title        | Athens Sports Bar
category     | Sports bar
address      | Veikou 3a, Athina 117 42, Greece
openhours    | Sunday, 10 AM to 12 AM; Monday, 10 AM to 12 AM; Tuesday, 10 AM to 12 AM; Wednesday, 10 AM to 12 AM; Thursday, 10 AM to 12 AM; Friday, 10 AM to 12 AM; Saturday, 10 AM to 12 AM. Hide open hours for the week
website      | http://www.athenssportsbar.gr/
phone        | +302109235811
pluscode     | XP8H+V9 Athens, Greece
review_count | 1
rating       | 4.4
-[ RECORD 2 ]+<span class="hljs-comment">-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------</span>
id           | 2
title        | 360 Cocktail bar
category     | Bar
address      | Ifestou 2, Athina 105 55, Greece
openhours    | Sunday, 9 AM to 4 AM; Monday, 9 AM to 3 AM; Tuesday, 9 AM to 3 AM; Wednesday, 9 AM to 3 AM; Thursday, 9 AM to 3 AM; Friday, 9 AM to 3 AM; Saturday, 9 AM to 4 AM. Hide open hours for the week
website      | http://www.three-sixty.gr/
phone        | +302103210006
pluscode     | XPGG+H6 Athens, Greece
review_count | 8
rating       | 4.4
-[ RECORD 3 ]+<span class="hljs-comment">-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------</span>
id           | 3
title        | Teddy Boy
category     | Bar
address      | Taki 18, Athina 105 54, Greece
openhours    | 
website      | https://m.facebook.com/teddyboy.bar
phone        | +306951116651
pluscode     | XPHF+8F Athens, Greece
review_count | 489
rating       | 4.5
-[ RECORD 4 ]+<span class="hljs-comment">-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------</span>
id           | 4
title        | Revolt street bar
category     | Bar
address      | Koletti 25-27, Athina 106 77, Greece
openhours    | Sunday, 11 AM to 2 AM; Monday, 11 AM to 2 AM; Tuesday, 11 AM to 2 AM; Wednesday, 11 AM to 2 AM; Thursday, 11 AM to 2 AM; Friday, 11 AM to 3 AM; Saturday, 11 AM to 3 AM. Hide open hours for the week
website      | https://www.facebook.com/Revoltstreetbar/
phone        | +302103800016
pluscode     | XPPM+85 Athens, Greece
review_count | 461
rating       | 4.5
-[ RECORD 5 ]+<span class="hljs-comment">-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------</span>
id           | 5
title        | 42 Barstronomy Athens
category     | Cocktail bar
address      | Kolokotroni 3, Athina 105 62, Greece
openhours    | 
website      | https://42barstronomy.gr/
phone        | +302130052153
pluscode     | XPGM+Q8 Athens, Greece
review_count | 1
rating       | 4.5

defaultdb=&gt;
</code></pre>
<h2 id="heading-conclusion">Conclusion</h2>
<p>In this tutorial, I showed you how you can use the <a target="_blank" href="https://github.com/gosom/google-maps-scraper">google-maps-scraper</a> in Kubernetes to automate and scale scraping Google Maps results.</p>
<p>Note: Please clean up the resources in your DigitalOcean account once you are done with this tutorial, to avoid unwanted charges.</p>
]]></content:encoded></item><item><title><![CDATA[How to extract data from Google maps using 
Golang]]></title><description><![CDATA[This post introduces a command-line application that allows you to extract data from Google Maps.
If you want to extract to collect some data for local businesses using Google Maps then this tool is for you.
The tools retrieve and export to a CSV fil...]]></description><link>https://blog.gkomninos.com/how-to-extract-data-from-google-maps-using-golang</link><guid isPermaLink="true">https://blog.gkomninos.com/how-to-extract-data-from-google-maps-using-golang</guid><category><![CDATA[Scraping]]></category><category><![CDATA[google maps crawler]]></category><category><![CDATA[data scraping]]></category><category><![CDATA[marketing]]></category><dc:creator><![CDATA[Georgios Komninos]]></dc:creator><pubDate>Sun, 07 May 2023 07:26:23 GMT</pubDate><content:encoded><![CDATA[<p>This post introduces a command-line application that allows you to extract data from Google Maps.</p>
<p>If you want to collect data about local businesses using Google Maps, then this tool is for you.</p>
<p>The tool retrieves the following data and exports it to a CSV file:</p>
<ul>
<li><p>Title: the title of the business</p>
</li>
<li><p>Category: the category of the business</p>
</li>
<li><p>Address: the address of the business</p>
</li>
<li><p>OpenHours: the opening hours of the business</p>
</li>
<li><p>Website: the website of the business</p>
</li>
<li><p>Phone: the phone number of the business</p>
</li>
<li><p>PlusCode: the plus code of the business</p>
</li>
<li><p>ReviewCount: the number of reviews</p>
</li>
<li><p>ReviewRating: the review rating of the business</p>
</li>
</ul>
<h2 id="heading-getting-started">Getting started</h2>
<ol>
<li><p>Create a file named example-queries.txt with your queries, for example:</p>
<pre><code class="lang-bash"> bars <span class="hljs-keyword">in</span> Athens
 doctor <span class="hljs-keyword">in</span> Berlin
 doctor <span class="hljs-keyword">in</span> Bonn
</code></pre>
</li>
<li><p>Make sure you have Docker installed</p>
</li>
<li><pre><code class="lang-bash"> docker run -v <span class="hljs-variable">$PWD</span>/example-queries.txt:/example-queries -v <span class="hljs-variable">$PWD</span>:/results gosom/google-maps-scraper -depth 1 -input /example-queries -results /results/result-file.csv
</code></pre>
</li>
<li><p>Wait for it to finish. The program does not exit automatically. When there are no updates in the console for some time, you can hit CTRL-C. Meanwhile, you will see the file result-file.csv being populated.</p>
</li>
<li><p>The results are written to the file result-file.csv.</p>
</li>
</ol>
<p>Note: Adjust the file names as needed.</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>In this blog post, I introduced the <a target="_blank" href="https://github.com/gosom/google-maps-scraper">google-maps-scraper</a> command-line tool.</p>
<p>The tool is built using the Go programming language and it uses <a target="_blank" href="https://github.com/gosom/scrapemate">scrapemate</a>.</p>
<p>All the code is on <a target="_blank" href="https://github.com/gosom/google-maps-scraper">GitHub</a>. Feel free to create an issue or leave a comment here if you run into problems or have ideas for extensions.</p>
]]></content:encoded></item><item><title><![CDATA[Golang web scraping using Scrapemate]]></title><description><![CDATA[Introduction
In this blog post, we are going to use scrapemate to extract hockey teams
from the website https://www.scrapethissite.com/pages/forms
This website contains sandboxes for testing your scrapers, so no real data.
You can find the full ...]]></description><link>https://blog.gkomninos.com/golang-web-scraping-using-scrapemate</link><guid isPermaLink="true">https://blog.gkomninos.com/golang-web-scraping-using-scrapemate</guid><category><![CDATA[web scraping]]></category><category><![CDATA[golang]]></category><category><![CDATA[data extraction]]></category><category><![CDATA[scraping framework]]></category><dc:creator><![CDATA[Georgios Komninos]]></dc:creator><pubDate>Sat, 15 Apr 2023 15:58:59 GMT</pubDate><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>In this blog post, we are going to use <a target="_blank" href="https://github.com/gosom/scrapemate">scrapemate</a> to extract hockey teams from the website <a target="_blank" href="https://www.scrapethissite.com/pages/forms/?page_num=1&amp;per_page=100">https://www.scrapethissite.com/pages/forms</a>.</p>
<p>This website contains sandboxes for testing your scrapers, so no real data is involved.</p>
<p>You can find the full code on <a target="_blank" href="https://github.com/gosom/scrapemate-highlevel-api-example">GitHub</a>.</p>
<p>The <a target="_blank" href="https://blog.gkomninos.com/getting-started-with-web-scraping-using-golang-and-scrapemate">previous post</a> uses the low-level API of scrapemate; this one uses the high-level API.</p>
<h2 id="heading-code-skeleton">Code Skeleton</h2>
<p>Create a folder named scrapemate-highlevel-api-example:</p>
<pre><code class="lang-bash">mkdir scrapemate-highlevel-api-example
<span class="hljs-built_in">cd</span> scrapemate-highlevel-api-example
</code></pre>
<p>Then initialize a Go module:</p>
<pre><code class="lang-bash">go mod init github.com/gosom/scrapemate-highlevel-api-example
</code></pre>
<p>Create 2 folders:</p>
<ul>
<li><p><code>hockey</code></p>
</li>
<li><p><code>testdata</code></p>
</li>
</ul>
<pre><code class="lang-bash">mkdir hockey testdata
</code></pre>
<h2 id="heading-parser">Parser</h2>
<p>Now we need to figure out how we are going to parse the data from the website.</p>
<p>The scrapemate high-level API uses goquery and CSS selectors. You can use another HTML parsing library through the low-level API if you like.</p>
<p>Create a file <code>team.go</code> in the <code>hockey</code> directory:</p>
<pre><code class="lang-bash">touch hockey/team.go
</code></pre>
<p>and copy the following:</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> hockey

<span class="hljs-keyword">import</span> (
    <span class="hljs-string">"strconv"</span>
    <span class="hljs-string">"strings"</span>

    <span class="hljs-string">"github.com/PuerkitoBio/goquery"</span>
)

<span class="hljs-keyword">type</span> Team <span class="hljs-keyword">struct</span> {
    Name         <span class="hljs-keyword">string</span>
    Year         <span class="hljs-keyword">int</span>
    Wins         <span class="hljs-keyword">int</span>
    Losses       <span class="hljs-keyword">int</span>
    OTLosses     <span class="hljs-keyword">int</span>
    WinPct       <span class="hljs-keyword">float64</span>
    GoalsFor     <span class="hljs-keyword">int</span>
    GoalsAgainst <span class="hljs-keyword">int</span>
    GoalDiff     <span class="hljs-keyword">int</span>
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-params">(t Team)</span> <span class="hljs-title">CsvHeaders</span><span class="hljs-params">()</span> []<span class="hljs-title">string</span></span> {
    <span class="hljs-keyword">return</span> []<span class="hljs-keyword">string</span>{
        <span class="hljs-string">"Name"</span>,
        <span class="hljs-string">"Year"</span>,
        <span class="hljs-string">"Wins"</span>,
        <span class="hljs-string">"Losses"</span>,
        <span class="hljs-string">"OTLosses"</span>,
        <span class="hljs-string">"WinPct"</span>,
        <span class="hljs-string">"GoalsFor"</span>,
        <span class="hljs-string">"GoalsAgainst"</span>,
        <span class="hljs-string">"GoalDiff"</span>,
    }
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-params">(t Team)</span> <span class="hljs-title">CsvRow</span><span class="hljs-params">()</span> []<span class="hljs-title">string</span></span> {
    <span class="hljs-keyword">return</span> []<span class="hljs-keyword">string</span>{
        t.Name,
        strconv.Itoa(t.Year),
        strconv.Itoa(t.Wins),
        strconv.Itoa(t.Losses),
        strconv.Itoa(t.OTLosses),
        strconv.FormatFloat(t.WinPct, <span class="hljs-string">'f'</span>, <span class="hljs-number">2</span>, <span class="hljs-number">64</span>),
        strconv.Itoa(t.GoalsFor),
        strconv.Itoa(t.GoalsAgainst),
        strconv.Itoa(t.GoalDiff),
    }
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">parseTeams</span><span class="hljs-params">(doc *goquery.Document)</span> <span class="hljs-params">([]Team, error)</span></span> {
    sel := <span class="hljs-string">"table.table tr.team"</span>
    <span class="hljs-keyword">var</span> teams []Team
    doc.Find(sel).Each(<span class="hljs-function"><span class="hljs-keyword">func</span><span class="hljs-params">(i <span class="hljs-keyword">int</span>, s *goquery.Selection)</span></span> {
        teams = <span class="hljs-built_in">append</span>(teams, parseTeam(s))
    })
    <span class="hljs-keyword">return</span> teams, <span class="hljs-literal">nil</span>
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">parseTeam</span><span class="hljs-params">(s *goquery.Selection)</span> <span class="hljs-title">Team</span></span> {
    <span class="hljs-keyword">var</span> team Team
    team.Name = cleanText(s.Find(<span class="hljs-string">"td.name"</span>).Text())
    team.Year = parseInt(s.Find(<span class="hljs-string">"td.year"</span>).Text())
    team.Wins = parseInt(s.Find(<span class="hljs-string">"td.wins"</span>).Text())
    team.Losses = parseInt(s.Find(<span class="hljs-string">"td.losses"</span>).Text())
    team.OTLosses = parseInt(s.Find(<span class="hljs-string">"td.ot-losses"</span>).Text())
    team.WinPct = parseFloat(s.Find(<span class="hljs-string">"td.pct"</span>).Text())
    team.GoalsFor = parseInt(s.Find(<span class="hljs-string">"td.gf"</span>).Text())
    team.GoalsAgainst = parseInt(s.Find(<span class="hljs-string">"td.ga"</span>).Text())
    team.GoalDiff = parseInt(s.Find(<span class="hljs-string">"td.diff"</span>).Text())
    <span class="hljs-keyword">return</span> team
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">parseNextLink</span><span class="hljs-params">(doc *goquery.Document)</span> <span class="hljs-params">(<span class="hljs-keyword">string</span>, <span class="hljs-keyword">map</span>[<span class="hljs-keyword">string</span>]<span class="hljs-keyword">string</span>)</span></span> {
    sel := <span class="hljs-string">"ul.pagination&gt;li:last-child&gt;a[aria-label=Next]"</span>
    s := doc.Find(sel).AttrOr(<span class="hljs-string">"href"</span>, <span class="hljs-string">""</span>)
    <span class="hljs-keyword">if</span> s == <span class="hljs-string">""</span> {
        <span class="hljs-keyword">return</span> <span class="hljs-string">""</span>, <span class="hljs-literal">nil</span>
    }
    s = <span class="hljs-string">"https://www.scrapethissite.com"</span> + s
    parts := strings.Split(s, <span class="hljs-string">"?"</span>)
    nextLink := parts[<span class="hljs-number">0</span>]
    params := <span class="hljs-built_in">make</span>(<span class="hljs-keyword">map</span>[<span class="hljs-keyword">string</span>]<span class="hljs-keyword">string</span>)
    <span class="hljs-keyword">for</span> _, p := <span class="hljs-keyword">range</span> strings.Split(parts[<span class="hljs-number">1</span>], <span class="hljs-string">"&amp;"</span>) {
        kv := strings.Split(p, <span class="hljs-string">"="</span>)
        params[kv[<span class="hljs-number">0</span>]] = kv[<span class="hljs-number">1</span>]
    }
    <span class="hljs-keyword">return</span> nextLink, params
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">cleanText</span><span class="hljs-params">(s <span class="hljs-keyword">string</span>)</span> <span class="hljs-title">string</span></span> {
    s = strings.TrimFunc(s, <span class="hljs-function"><span class="hljs-keyword">func</span><span class="hljs-params">(r <span class="hljs-keyword">rune</span>)</span> <span class="hljs-title">bool</span></span> {
        <span class="hljs-keyword">return</span> r == <span class="hljs-string">'\n'</span>
    })
    <span class="hljs-keyword">return</span> strings.TrimSpace(s)
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">parseInt</span><span class="hljs-params">(s <span class="hljs-keyword">string</span>)</span> <span class="hljs-title">int</span></span> {
    s = cleanText(s)
    <span class="hljs-keyword">if</span> s == <span class="hljs-string">""</span> {
        <span class="hljs-keyword">return</span> <span class="hljs-number">0</span>
    }
    ans, _ := strconv.Atoi(s)
    <span class="hljs-keyword">return</span> ans
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">parseFloat</span><span class="hljs-params">(s <span class="hljs-keyword">string</span>)</span> <span class="hljs-title">float64</span></span> {
    s = cleanText(s)
    <span class="hljs-keyword">if</span> s == <span class="hljs-string">""</span> {
        <span class="hljs-keyword">return</span> <span class="hljs-number">0</span>
    }
    ans, _ := strconv.ParseFloat(s, <span class="hljs-number">64</span>)
    <span class="hljs-keyword">return</span> ans
}
</code></pre>
<p>The code above is straightforward.</p>
<p>The most important functions are:</p>
<ul>
<li><p><code>parseTeams</code> : returns a list of <code>Team</code> structs with the attributes populated</p>
</li>
<li><p><code>parseNextLink</code>: returns the next link in two parts: the URL and the URL parameters</p>
</li>
</ul>
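<p>Note that <code>parseNextLink</code> splits the link by hand and assumes that it always contains a <code>?</code> and that every parameter has the form <code>key=value</code>. As an alternative, here is a minimal sketch that does the same split with the standard <code>net/url</code> package, which also copes with a missing query string. The function name <code>splitLink</code> is hypothetical and not part of the example repository.</p>

```go
package main

import (
	"fmt"
	"net/url"
)

// splitLink resolves href against base and returns the URL without its
// query string, plus the query parameters as a flat map. When a key is
// repeated, only its first value is kept.
func splitLink(base, href string) (string, map[string]string, error) {
	b, err := url.Parse(base)
	if err != nil {
		return "", nil, err
	}
	u, err := b.Parse(href)
	if err != nil {
		return "", nil, err
	}
	params := make(map[string]string)
	for k, v := range u.Query() {
		if len(v) != 0 {
			params[k] = v[0]
		}
	}
	u.RawQuery = "" // drop the query so only the bare link remains
	return u.String(), params, nil
}

func main() {
	link, params, err := splitLink("https://www.scrapethissite.com", "/pages/forms/?page_num=2&per_page=100")
	if err != nil {
		panic(err)
	}
	fmt.Println(link, params["page_num"], params["per_page"])
}
```

<p>With a link that has no query string, the returned map is simply empty instead of causing an index-out-of-range panic.</p>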
<p>We need to test that we parse properly. Let's create some unit tests.</p>
<p>But first, download the page into the testdata folder:</p>
<pre><code class="lang-bash"> curl -o testdata/teams.html <span class="hljs-string">'https://www.scrapethissite.com/pages/forms/?page_num=1&amp;per_page=100'</span>
</code></pre>
<p>Create a file <code>hockey/team_test.go</code> and paste the following:</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> hockey

<span class="hljs-keyword">import</span> (
    <span class="hljs-string">"os"</span>
    <span class="hljs-string">"testing"</span>

    <span class="hljs-string">"github.com/PuerkitoBio/goquery"</span>
    <span class="hljs-string">"github.com/stretchr/testify/require"</span>
)

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">Test_parseTeams</span><span class="hljs-params">(t *testing.T)</span></span> {
    fd, err := os.Open(<span class="hljs-string">"../testdata/teams.html"</span>)
    require.NoError(t, err)
    <span class="hljs-keyword">defer</span> fd.Close()
    doc, err := goquery.NewDocumentFromReader(fd)
    require.NoError(t, err)

    teams, err := parseTeams(doc)
    require.NoError(t, err)
    require.Equal(t, <span class="hljs-number">100</span>, <span class="hljs-built_in">len</span>(teams))

    team := teams[<span class="hljs-number">0</span>]
    require.Equal(t, <span class="hljs-string">"Boston Bruins"</span>, team.Name)
    require.Equal(t, <span class="hljs-number">1990</span>, team.Year)
    require.Equal(t, <span class="hljs-number">44</span>, team.Wins)
    require.Equal(t, <span class="hljs-number">24</span>, team.Losses)
    require.Equal(t, <span class="hljs-number">0</span>, team.OTLosses)
    require.Equal(t, <span class="hljs-number">0.55</span>, team.WinPct)
    require.Equal(t, <span class="hljs-number">299</span>, team.GoalsFor)
    require.Equal(t, <span class="hljs-number">264</span>, team.GoalsAgainst)
    require.Equal(t, <span class="hljs-number">35</span>, team.GoalDiff)
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">Test_parseNextLink</span><span class="hljs-params">(t *testing.T)</span></span> {
    fd, err := os.Open(<span class="hljs-string">"../testdata/teams.html"</span>)
    require.NoError(t, err)
    <span class="hljs-keyword">defer</span> fd.Close()
    doc, err := goquery.NewDocumentFromReader(fd)
    require.NoError(t, err)

    nextLink, params := parseNextLink(doc)
    require.Equal(t, <span class="hljs-string">"https://www.scrapethissite.com/pages/forms/"</span>, nextLink)
    require.Equal(t, <span class="hljs-string">"2"</span>, params[<span class="hljs-string">"page_num"</span>])
    require.Equal(t, <span class="hljs-string">"100"</span>, params[<span class="hljs-string">"per_page"</span>])
}
</code></pre>
<p>Run</p>
<pre><code class="lang-bash">go mod tidy
</code></pre>
<p>and then run the unit tests:</p>
<pre><code class="lang-bash">go test -v ./...
</code></pre>
<p>The tests should pass.</p>
<h2 id="heading-scraping-job-definition">Scraping Job definition</h2>
<p>Create a file <code>hockey/collect.go</code>:</p>
<pre><code class="lang-bash">touch hockey/collect.go
</code></pre>
<p>and paste the following:</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> hockey

<span class="hljs-keyword">import</span> (
    <span class="hljs-string">"context"</span>
    <span class="hljs-string">"fmt"</span>
    <span class="hljs-string">"net/http"</span>
    <span class="hljs-string">"time"</span>

    <span class="hljs-string">"github.com/PuerkitoBio/goquery"</span>
    <span class="hljs-string">"github.com/google/uuid"</span>
    <span class="hljs-string">"github.com/gosom/scrapemate"</span>
)

<span class="hljs-keyword">type</span> TeamCollectJob <span class="hljs-keyword">struct</span> {
    scrapemate.Job
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">NewTeamCollectJob</span><span class="hljs-params">(u <span class="hljs-keyword">string</span>, params <span class="hljs-keyword">map</span>[<span class="hljs-keyword">string</span>]<span class="hljs-keyword">string</span>)</span> *<span class="hljs-title">TeamCollectJob</span></span> {
    <span class="hljs-keyword">return</span> &amp;TeamCollectJob{
        Job: scrapemate.Job{
            <span class="hljs-comment">// just give it a random id</span>
            ID:        uuid.New().String(),
            Method:    http.MethodGet,
            URL:       u,
            UrlParams: params,
            Headers: <span class="hljs-keyword">map</span>[<span class="hljs-keyword">string</span>]<span class="hljs-keyword">string</span>{
                <span class="hljs-string">"User-Agent"</span>: scrapemate.DefaultUserAgent,
            },
            Timeout:    <span class="hljs-number">10</span> * time.Second,
            MaxRetries: <span class="hljs-number">3</span>,
        },
    }
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-params">(o *TeamCollectJob)</span> <span class="hljs-title">Process</span><span class="hljs-params">(ctx context.Context, resp *scrapemate.Response)</span> <span class="hljs-params">(any, []scrapemate.IJob, error)</span></span> {
    doc, ok := resp.Document.(*goquery.Document)
    <span class="hljs-keyword">if</span> !ok {
        <span class="hljs-keyword">return</span> <span class="hljs-literal">nil</span>, <span class="hljs-literal">nil</span>, fmt.Errorf(<span class="hljs-string">"invalid document type %T expected *goquery.Document"</span>, resp.Document)
    }
    teams, err := parseTeams(doc)
    <span class="hljs-keyword">if</span> err != <span class="hljs-literal">nil</span> {
        <span class="hljs-keyword">return</span> <span class="hljs-literal">nil</span>, <span class="hljs-literal">nil</span>, err
    }

    <span class="hljs-keyword">var</span> nextJobs []scrapemate.IJob

    nextLink, params := parseNextLink(doc)
    <span class="hljs-keyword">if</span> nextLink != <span class="hljs-string">""</span> {
        nextJobs = <span class="hljs-built_in">append</span>(nextJobs, NewTeamCollectJob(nextLink, params))
    }

    <span class="hljs-keyword">return</span> teams, nextJobs, <span class="hljs-literal">nil</span>
}
</code></pre>
<p>Here we define the scraping job (<code>TeamCollectJob</code>) and the <code>Process</code> method.</p>
<p>The <code>Process</code> method returns three things:</p>
<ul>
<li><p>the result (here a slice of <code>Team</code>)</p>
</li>
<li><p>the next jobs (the job defined by the next link in pagination)</p>
</li>
<li><p>an error if it occurs</p>
</li>
</ul>
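<p>To make the role of the three return values more concrete, here is a simplified sketch of a loop that consumes them: it takes a job from a queue, collects its result, and enqueues the follow-up jobs until the queue is empty. This is not how scrapemate is implemented; the <code>job</code> type, <code>lastPage</code> and <code>crawl</code> are hypothetical stand-ins that use only the standard library.</p>

```go
package main

import "fmt"

// job is a hypothetical stand-in for a scraping job.
// It is not scrapemate's IJob interface.
type job struct {
	page int
}

// lastPage is an arbitrary limit for this illustration.
const lastPage = 3

// process mimics the shape of a Process method: it returns a result,
// the follow-up jobs, and an error.
func (j job) process() (string, []job, error) {
	if j.page < 1 {
		return "", nil, fmt.Errorf("invalid page %d", j.page)
	}
	var next []job
	if j.page < lastPage {
		next = append(next, job{page: j.page + 1})
	}
	return fmt.Sprintf("teams from page %d", j.page), next, nil
}

// crawl drains a FIFO queue seeded with the given jobs. It collects the
// results and enqueues whatever follow-up jobs each job returns.
func crawl(seed ...job) ([]string, error) {
	queue := append([]job(nil), seed...)
	var results []string
	for len(queue) != 0 {
		j := queue[0]
		queue = queue[1:]
		res, next, err := j.process()
		if err != nil {
			return results, err
		}
		results = append(results, res)
		queue = append(queue, next...)
	}
	return results, nil
}

func main() {
	results, err := crawl(job{page: 1})
	if err != nil {
		panic(err)
	}
	for _, r := range results {
		fmt.Println(r)
	}
}
```

<p>A real framework adds concurrency, retries and result writers on top, but the idea of following pagination by returning the next job stays the same.</p>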
<p>Run</p>
<pre><code class="lang-bash">go mod tidy
</code></pre>
<h2 id="heading-the-main-function">The main function</h2>
<p>Create a file <code>main.go</code>:</p>
<pre><code class="lang-bash">touch main.go
</code></pre>
<p>and paste the contents:</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> main

<span class="hljs-keyword">import</span> (
    <span class="hljs-string">"context"</span>
    <span class="hljs-string">"encoding/csv"</span>
    <span class="hljs-string">"os"</span>

    <span class="hljs-string">"github.com/gosom/scrapemate"</span>
    <span class="hljs-string">"github.com/gosom/scrapemate-highlevel-api-example/hockey"</span>
    <span class="hljs-string">"github.com/gosom/scrapemate/adapters/writers/csvwriter"</span>
    <span class="hljs-string">"github.com/gosom/scrapemate/scrapemateapp"</span>
)

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span> {
    <span class="hljs-keyword">if</span> err := run(); err != <span class="hljs-literal">nil</span> {
        os.Stderr.WriteString(err.Error() + <span class="hljs-string">"\n"</span>)
        os.Exit(<span class="hljs-number">1</span>)
        <span class="hljs-keyword">return</span>
    }
    os.Exit(<span class="hljs-number">0</span>)
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">run</span><span class="hljs-params">()</span> <span class="hljs-title">error</span></span> {
    csvWriter := csvwriter.NewCsvWriter(csv.NewWriter(os.Stdout))

    writers := []scrapemate.ResultWriter{
        csvWriter,
    }

    cfg, err := scrapemateapp.NewConfig(writers)
    <span class="hljs-keyword">if</span> err != <span class="hljs-literal">nil</span> {
        <span class="hljs-keyword">return</span> err
    }
    app, err := scrapemateapp.NewScrapeMateApp(cfg)
    <span class="hljs-keyword">if</span> err != <span class="hljs-literal">nil</span> {
        <span class="hljs-keyword">return</span> err
    }
    params := <span class="hljs-keyword">map</span>[<span class="hljs-keyword">string</span>]<span class="hljs-keyword">string</span>{
        <span class="hljs-string">"page_num"</span>: <span class="hljs-string">"1"</span>,
        <span class="hljs-string">"per_page"</span>: <span class="hljs-string">"100"</span>,
    }
    seedJobs := []scrapemate.IJob{
        hockey.NewTeamCollectJob(<span class="hljs-string">"https://www.scrapethissite.com/pages/forms/"</span>, params),
    }
    <span class="hljs-keyword">return</span> app.Start(context.Background(), seedJobs...)
}
</code></pre>
<p>Here we define a CSV writer that writes to stdout:</p>
<pre><code class="lang-go">csvWriter := csvwriter.NewCsvWriter(csv.NewWriter(os.Stdout))

writers := []scrapemate.ResultWriter{
    csvWriter,
}
</code></pre>
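<p>The <code>CsvHeaders</code> and <code>CsvRow</code> methods we defined on <code>Team</code> are presumably what such a writer relies on. As a rough illustration of the idea, and not of scrapemate's actual implementation, the sketch below writes the header once and then one record per result using only <code>encoding/csv</code>. The <code>row</code> interface, the trimmed-down <code>team</code> type and the <code>toCSV</code> function are assumptions made for this example, and the second team is made-up sample data.</p>

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strconv"
	"strings"
)

// row is the pair of methods a CSV-friendly result exposes in this sketch.
type row interface {
	CsvHeaders() []string
	CsvRow() []string
}

// team is a trimmed-down, illustrative version of the Team struct.
type team struct {
	Name string
	Wins int
}

func (t team) CsvHeaders() []string { return []string{"Name", "Wins"} }
func (t team) CsvRow() []string     { return []string{t.Name, strconv.Itoa(t.Wins)} }

// toCSV writes the header once, followed by one record per row,
// and returns the CSV text.
func toCSV(rows []row) (string, error) {
	var sb strings.Builder
	cw := csv.NewWriter(&sb)
	for i, r := range rows {
		if i == 0 {
			if err := cw.Write(r.CsvHeaders()); err != nil {
				return "", err
			}
		}
		if err := cw.Write(r.CsvRow()); err != nil {
			return "", err
		}
	}
	cw.Flush()
	return sb.String(), cw.Error()
}

func main() {
	rows := []row{
		team{Name: "Boston Bruins", Wins: 44},
		team{Name: "Example Team", Wins: 10}, // made-up sample row
	}
	out, err := toCSV(rows)
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```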
<p>Then we initialize our scraper:</p>
<pre><code class="lang-go">cfg, err := scrapemateapp.NewConfig(writers)
    <span class="hljs-keyword">if</span> err != <span class="hljs-literal">nil</span> {
        <span class="hljs-keyword">return</span> err
    }
    app, err := scrapemateapp.NewScrapeMateApp(cfg)
    <span class="hljs-keyword">if</span> err != <span class="hljs-literal">nil</span> {
        <span class="hljs-keyword">return</span> err
    }
</code></pre>
<p>Finally, we create a seed job (the one that our scraper starts with) and start the scraper:</p>
<pre><code class="lang-go">params := <span class="hljs-keyword">map</span>[<span class="hljs-keyword">string</span>]<span class="hljs-keyword">string</span>{
        <span class="hljs-string">"page_num"</span>: <span class="hljs-string">"1"</span>,
        <span class="hljs-string">"per_page"</span>: <span class="hljs-string">"100"</span>,
    }
    seedJobs := []scrapemate.IJob{
        hockey.NewTeamCollectJob(<span class="hljs-string">"https://www.scrapethissite.com/pages/forms/"</span>, params),
    }
    <span class="hljs-keyword">return</span> app.Start(context.Background(), seedJobs...)
</code></pre>
<p>Run</p>
<pre><code class="lang-bash">go mod tidy
</code></pre>
<h2 id="heading-run-the-scraper">Run the scraper</h2>
<p>In order to run the scraper just do:</p>
<pre><code class="lang-go"> <span class="hljs-keyword">go</span> run main.<span class="hljs-keyword">go</span> <span class="hljs-number">1</span>&gt;hockey.csv
</code></pre>
<p>After all 6 pages are crawled, you can stop the scraper with CTRL-C.</p>
<p>The results should be in <code>hockey.csv</code>.</p>
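<p>As a quick sanity check, a few lines of Go can confirm that the file parses as CSV and report how many records it holds. This is only a sketch: <code>countRecords</code> is a hypothetical helper written for this illustration, and nothing is assumed about the column layout of the output.</p>

```go
package main

import (
	"encoding/csv"
	"fmt"
	"os"
	"strings"
)

// countRecords is a hypothetical helper: it parses data as CSV and
// returns the number of records, or an error if the CSV is malformed.
func countRecords(data string) (int, error) {
	records, err := csv.NewReader(strings.NewReader(data)).ReadAll()
	if err != nil {
		return 0, err
	}
	return len(records), nil
}

func main() {
	// hockey.csv is the file produced by the redirect shown above.
	data, err := os.ReadFile("hockey.csv")
	if err != nil {
		fmt.Println("could not read hockey.csv:", err)
		return
	}
	n, err := countRecords(string(data))
	if err != nil {
		fmt.Println("invalid CSV:", err)
		return
	}
	fmt.Println("records:", n)
}
```

<p>If the count looks too low, the scraper was probably stopped before all pages were crawled.</p>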
<h2 id="heading-summary">Summary</h2>
<p><a target="_blank" href="https://github.com/gosom/scrapemate">Scrapemate</a> is a web scraping framework written in Golang. In this post, we demonstrated how easy it is to create a scraper and save the results in a CSV file.</p>
<p>Read the blog post about the low-level API <a target="_blank" href="https://blog.gkomninos.com/getting-started-with-web-scraping-using-golang-and-scrapemate">here</a> and see how much easier it is to scrape using the high-level API.</p>
<p>See the full example <a target="_blank" href="https://github.com/gosom/scrapemate-highlevel-api-example">here</a> and another example in the <a target="_blank" href="https://github.com/gosom/scrapemate/tree/main/examples/quotes-to-scrape-app">github examples</a>.</p>
]]></content:encoded></item><item><title><![CDATA[Getting Started with Web Scraping Using Golang and Scrapemate]]></title><description><![CDATA[Introduction
Web scraping is the process of extracting data from websites, and it can be a powerful tool for collecting information for research, analysis, or automation.
In this tutorial, I will show you how to use Golang and the Scrapemate framewor...]]></description><link>https://blog.gkomninos.com/getting-started-with-web-scraping-using-golang-and-scrapemate</link><guid isPermaLink="true">https://blog.gkomninos.com/getting-started-with-web-scraping-using-golang-and-scrapemate</guid><category><![CDATA[golang]]></category><category><![CDATA[Scraping]]></category><category><![CDATA[webscraping ]]></category><category><![CDATA[web scraping]]></category><category><![CDATA[Tutorial]]></category><dc:creator><![CDATA[Georgios Komninos]]></dc:creator><pubDate>Fri, 14 Apr 2023 07:50:30 GMT</pubDate><content:encoded><![CDATA[<h2 id="heading-introduction"><strong>Introduction</strong></h2>
<p>Web scraping is the process of extracting data from websites, and it can be a powerful tool for collecting information for research, analysis, or automation.</p>
<p>In this tutorial, I will show you how to use Golang and the <a target="_blank" href="https://github.com/gosom/scrapemate">Scrapemate</a> framework to scrape data from a website.</p>
<p>As an example, we will extract product information from <a target="_blank" href="https://scrapeme.live/shop/"><strong>https://scrapeme.live/shop/</strong></a>. Specifically, we'll extract</p>
<ul>
<li><p>title</p>
</li>
<li><p>price</p>
</li>
<li><p>short_description</p>
</li>
<li><p>sku</p>
</li>
<li><p>categories</p>
</li>
<li><p>tags</p>
</li>
</ul>
<p>for each of the Pokemon products on the site.</p>
<p>Once we have this data, we'll create a CSV file containing it.</p>
<p>The image below highlights the data we need to extract for each product:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1681272581108/f7fe86e5-a6f9-493d-96f9-0b3edeab4d22.png" alt="screenshot of scrapeme.live website that highlights the data we have to scrape" class="image--center mx-auto" /></p>
<h2 id="heading-prerequisites"><strong>Prerequisites</strong></h2>
<p>Before you get started, make sure you have Go 1.20 or later installed. You can find installation instructions <a target="_blank" href="https://go.dev/doc/install">here</a>.</p>
<h2 id="heading-step1-inspecting-the-website">Step 1: Inspecting the website</h2>
<p>The first step in any web scraping project is to inspect the website you want to scrape. In our case, we want to extract data about Pokemon products from <a target="_blank" href="https://scrapeme.live/shop/"><strong>https://scrapeme.live/shop/</strong></a>.</p>
<h3 id="heading-finding-the-css-selectors-from-the-home-page">Finding the css-selectors from the home page</h3>
<p>Open <a target="_blank" href="https://scrapeme.live/shop/">https://scrapeme.live/shop/</a> in your browser. This opens the home page of the Pokemon e-shop.</p>
<p>This will be the starting page for our scraper.<br />Our scraper should find all the links to the products (Pokemon in this case) and, additionally, the link to the next page.</p>
<p>The idea is to visit every listing page via pagination, extract the Pokemon product links from each one, then visit each product link and extract the information.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1681274772085/a171ee1b-4424-48f1-8918-1a2ba53c3a14.png" alt class="image--center mx-auto" /></p>
<p>Using Chrome's Developer Tools, we can right-click the element we are interested in and find a suitable CSS selector to use in our parser.</p>
<p>The image above shows how to do it in Chrome:</p>
<ul>
<li><p>Hit F12 to open developer tools</p>
</li>
<li><p>Right-click the <code>next</code> arrow that takes you to the next page and click Inspect</p>
</li>
<li><p>In the developer tools, find the proper CSS selector</p>
</li>
</ul>
<p>In our case, to get the link to the next page we need the element with the following CSS selector:</p>
<pre><code class="lang-bash">a.next.page-numbers
</code></pre>
<p>By right-clicking on a product (Pokemon) image we can inspect it and find the element that contains the link to the detailed product page.<br />In our case it's</p>
<pre><code class="lang-bash">a.woocommerce-LoopProduct-link
</code></pre>
<h3 id="heading-finding-the-css-selectors-for-the-product-details-page">Finding the CSS selectors for the product details page</h3>
<p>Now visit a product detail page like <a target="_blank" href="https://scrapeme.live/shop/Bulbasaur/">https://scrapeme.live/shop/Bulbasaur/</a></p>
<p>Similarly, we find the CSS selectors for the elements we are interested in.</p>
<p>In particular, we have:</p>
<ul>
<li><p>title : <code>h1.product_title</code></p>
</li>
<li><p>price: <code>p.price</code></p>
</li>
<li><p>short_description: <code>div.woocommerce-product-details__short-description&gt;p</code></p>
</li>
<li><p>sku: <code>span.sku</code></p>
</li>
<li><p>categories: <code>div.product_meta &gt; span.posted_in &gt; a</code></p>
</li>
<li><p>tags: <code>div.product_meta &gt; span.tagged_as &gt; a</code></p>
</li>
</ul>
<h2 id="heading-step2-create-project-skeleton">Step 2: Create project skeleton</h2>
<p>You need to create a folder that will host your code:</p>
<pre><code class="lang-bash">
mkdir scrapemelive
<span class="hljs-built_in">cd</span> scrapemelive
</code></pre>
<p>Initialize a Go module using <code>go mod init</code>:</p>
<pre><code class="lang-bash">go mod init scrapemelive
</code></pre>
<p>Then create the following folders:</p>
<pre><code class="lang-bash">mkdir scrapemelive
mkdir testdata
</code></pre>
<p>The contents of the folder should be:</p>
<pre><code class="lang-bash">├── go.mod
├── scrapemelive
└── testdata
</code></pre>
<p>Add also a <code>main.go</code> file which just prints 'hello world' for the moment.</p>
<pre><code class="lang-bash">touch main.go
</code></pre>
<p>Open <code>main.go</code> in your editor and add the following:</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> main

<span class="hljs-keyword">import</span> <span class="hljs-string">"fmt"</span>

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span> {
        fmt.Println(<span class="hljs-string">"hello world"</span>)
}
</code></pre>
<p>Let's check that the code runs with <code>go run main.go</code>.<br />You should see <code>hello world</code> printed to standard output.</p>
<p>The <code>main.go</code> file will contain the code that will start our scraper and the code that writes the results to CSV.</p>
<p>The folder <code>scrapemelive</code> will contain the functions and types that the framework will use.</p>
<p>The folder <code>testdata</code> will contain data that we need for our unit tests.</p>
<h2 id="heading-step-3-writing-the-scraping-code"><strong>Step 3: Writing the Scraping Code</strong></h2>
<p>Let's first create a file in <code>scrapemelive/product.go</code> . Here we are going to add a struct that holds the Product data that we scrape.</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> scrapemelive

<span class="hljs-comment">// Product is a product scraped from the detail page</span>
<span class="hljs-keyword">type</span> Product <span class="hljs-keyword">struct</span> {
        <span class="hljs-comment">// Title is the title of the product</span>
        Title <span class="hljs-keyword">string</span>
        <span class="hljs-comment">// Price is the price of the product</span>
        Price <span class="hljs-keyword">string</span>
        <span class="hljs-comment">// ShortDescription is the short description of the product</span>
        ShortDescription <span class="hljs-keyword">string</span>
        <span class="hljs-comment">// Sku is the sku of the product</span>
        Sku <span class="hljs-keyword">string</span>
        <span class="hljs-comment">// Categories is the categories of the product</span>
        Categories []<span class="hljs-keyword">string</span>
        <span class="hljs-comment">// Tags is the tags of the product</span>
        Tags []<span class="hljs-keyword">string</span>
}
</code></pre>
<p>Each scraped Product will be an instance of the <code>Product</code> struct.</p>
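<p>Since the goal is a CSV file, each <code>Product</code> eventually has to be flattened into a row of strings. The sketch below shows one way this could look using only the standard library. The <code>toCSVRow</code> helper and the choice of <code>|</code> to join multi-valued fields are assumptions made for this illustration; they are not part of Scrapemate.</p>

```go
package main

import (
	"encoding/csv"
	"os"
	"strings"
)

// Product mirrors the struct defined above.
type Product struct {
	Title            string
	Price            string
	ShortDescription string
	Sku              string
	Categories       []string
	Tags             []string
}

// toCSVRow is a hypothetical helper that flattens a Product into one CSV row.
// Slice fields are joined with "|" so that each product stays on a single row.
func toCSVRow(p Product) []string {
	return []string{
		p.Title,
		p.Price,
		p.ShortDescription,
		p.Sku,
		strings.Join(p.Categories, "|"),
		strings.Join(p.Tags, "|"),
	}
}

func main() {
	p := Product{
		Title:      "Charmeleon",
		Price:      "£165.00",
		Sku:        "6565",
		Categories: []string{"Pokemon", "Flame"},
		Tags:       []string{"Blaze", "charmeleon", "Flame"},
	}
	w := csv.NewWriter(os.Stdout)
	if err := w.Write(toCSVRow(p)); err != nil {
		panic(err)
	}
	// Flush pushes buffered rows to stdout; Error reports any write failure.
	w.Flush()
	if err := w.Error(); err != nil {
		panic(err)
	}
}
```

<p>Keeping the flattening in one small function makes the column order explicit and easy to test on its own.</p>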
<p>Let's also create a <code>scrapemelive/product_test.go</code> file in which we will write our unit tests. For now, it contains only the package declaration:</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> scrapemelive
</code></pre>
<p>Our project folder should look like this</p>
<pre><code class="lang-bash">├── go.mod
├── main.go
├── scrapemelive
│   ├── product.go
│   └── product_test.go
└── testdata
</code></pre>
<p>The next step is to write the functions that extract the data from the detail page based on the css selectors we identified above.</p>
<p>Open the <code>scrapemelive/product.go</code> file and add the following</p>
<pre><code class="lang-go">
<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">parseTitle</span><span class="hljs-params">(doc *goquery.Document)</span> <span class="hljs-title">string</span></span> {
        <span class="hljs-keyword">return</span> doc.Find(<span class="hljs-string">"h1.product_title"</span>).Text()
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">parsePrice</span><span class="hljs-params">(doc *goquery.Document)</span> <span class="hljs-title">string</span></span> {
        <span class="hljs-keyword">return</span> doc.Find(<span class="hljs-string">"p.price"</span>).Text()
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">parseShortDescription</span><span class="hljs-params">(doc *goquery.Document)</span> <span class="hljs-title">string</span></span> {
        <span class="hljs-keyword">return</span> doc.Find(<span class="hljs-string">"div.woocommerce-product-details__short-description&gt;p"</span>).Text()
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">parseSku</span><span class="hljs-params">(doc *goquery.Document)</span> <span class="hljs-title">string</span></span> {
        <span class="hljs-keyword">return</span> doc.Find(<span class="hljs-string">"span.sku"</span>).Text()
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">parseCategories</span><span class="hljs-params">(doc *goquery.Document)</span> []<span class="hljs-title">string</span></span> {
        <span class="hljs-keyword">var</span> categories []<span class="hljs-keyword">string</span>
        doc.Find(<span class="hljs-string">"div.product_meta &gt; span.posted_in &gt; a"</span>).Each(<span class="hljs-function"><span class="hljs-keyword">func</span><span class="hljs-params">(i <span class="hljs-keyword">int</span>, s *goquery.Selection)</span></span> {
                categories = <span class="hljs-built_in">append</span>(categories, s.Text())
        })
        <span class="hljs-keyword">return</span> categories
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">parseTags</span><span class="hljs-params">(doc *goquery.Document)</span> []<span class="hljs-title">string</span></span> {
        <span class="hljs-keyword">var</span> tags []<span class="hljs-keyword">string</span>
        doc.Find(<span class="hljs-string">"div.product_meta &gt; span.tagged_as &gt; a"</span>).Each(<span class="hljs-function"><span class="hljs-keyword">func</span><span class="hljs-params">(i <span class="hljs-keyword">int</span>, s *goquery.Selection)</span></span> {
                tags = <span class="hljs-built_in">append</span>(tags, s.Text())
        })
        <span class="hljs-keyword">return</span> tags
}
</code></pre>
<p>We need the <code>goquery</code> library so let's also get it</p>
<pre><code class="lang-bash">go get github.com/PuerkitoBio/goquery
</code></pre>
<p>Make sure you also import goquery at the top of the file:</p>
<pre><code class="lang-go"><span class="hljs-keyword">import</span> <span class="hljs-string">"github.com/PuerkitoBio/goquery"</span>
</code></pre>
<p>Each of the above functions accepts a <code>*goquery.Document</code> as input, uses goquery to extract the data we are interested in, and returns it.</p>
<p>We wrote some code but it's not tested yet. Let's write some tests then.</p>
<p>Before writing the tests let's download the HTML of a product page in our test data directory:</p>
<pre><code class="lang-bash">curl -o testdata/sample-product.html <span class="hljs-string">'https://scrapeme.live/shop/Charmeleon/'</span>
</code></pre>
<p>The command above saves the HTML for the Charmeleon product in the file testdata/sample-product.html.</p>
<p>Let's now add some tests.</p>
<p>For unit tests we use the <code>testify</code> library, so get it via <code>go get</code></p>
<pre><code class="lang-bash">go get github.com/stretchr/testify/require
</code></pre>
<p>In <code>scrapemelive/product_test.go</code> add the following:</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> scrapemelive

<span class="hljs-keyword">import</span> (
        <span class="hljs-string">"os"</span>
        <span class="hljs-string">"testing"</span>

        <span class="hljs-string">"github.com/PuerkitoBio/goquery"</span>
        <span class="hljs-string">"github.com/stretchr/testify/require"</span>
)

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">openTestFile</span><span class="hljs-params">(t *testing.T, filename <span class="hljs-keyword">string</span>)</span> *<span class="hljs-title">goquery</span>.<span class="hljs-title">Document</span></span> {
        t.Helper()
        file, err := os.Open(filename)
        require.NoError(t, err)
        <span class="hljs-keyword">defer</span> file.Close()
        doc, err := goquery.NewDocumentFromReader(file)
        require.NoError(t, err)
        <span class="hljs-keyword">return</span> doc
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">Test_parseTitle</span><span class="hljs-params">(t *testing.T)</span></span> {
        t.Parallel()
        doc := openTestFile(t, <span class="hljs-string">"../testdata/sample-product.html"</span>)
        require.Equal(t, <span class="hljs-string">"Charmeleon"</span>, parseTitle(doc))
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">Test_parsePrice</span><span class="hljs-params">(t *testing.T)</span></span> {
        t.Parallel()
        doc := openTestFile(t, <span class="hljs-string">"../testdata/sample-product.html"</span>)
        require.Equal(t, <span class="hljs-string">"£165.00"</span>, parsePrice(doc))
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">Test_parseShortDescription</span><span class="hljs-params">(t *testing.T)</span></span> {
        t.Parallel()
        doc := openTestFile(t, <span class="hljs-string">"../testdata/sample-product.html"</span>)
        require.Equal(t, <span class="hljs-string">"Charmeleon mercilessly destroys its foes using its sharp claws. If it encounters a strong foe, it turns aggressive. In this excited state, the flame at the tip of its tail flares with a bluish white color."</span>, parseShortDescription(doc))
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">Test_parseSku</span><span class="hljs-params">(t *testing.T)</span></span> {
        t.Parallel()
        doc := openTestFile(t, <span class="hljs-string">"../testdata/sample-product.html"</span>)
        require.Equal(t, <span class="hljs-string">"6565"</span>, parseSku(doc))
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">Test_parseCategories</span><span class="hljs-params">(t *testing.T)</span></span> {
        t.Parallel()
        doc := openTestFile(t, <span class="hljs-string">"../testdata/sample-product.html"</span>)
        require.ElementsMatch(t, []<span class="hljs-keyword">string</span>{<span class="hljs-string">"Pokemon"</span>, <span class="hljs-string">"Flame"</span>}, parseCategories(doc))
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">Test_parseTags</span><span class="hljs-params">(t *testing.T)</span></span> {
        t.Parallel()
        doc := openTestFile(t, <span class="hljs-string">"../testdata/sample-product.html"</span>)
        require.ElementsMatch(t, []<span class="hljs-keyword">string</span>{<span class="hljs-string">"Blaze"</span>, <span class="hljs-string">"charmeleon"</span>, <span class="hljs-string">"Flame"</span>}, parseTags(doc))
}
</code></pre>
<p>Let's run our tests:</p>
<pre><code class="lang-bash">go <span class="hljs-built_in">test</span> -v ./...
</code></pre>
<p>All tests should pass, with output like:</p>
<pre><code class="lang-bash">?       scrapemelive    [no <span class="hljs-built_in">test</span> files]
=== RUN   Test_parseTitle
=== PAUSE Test_parseTitle
=== RUN   Test_parsePrice
=== PAUSE Test_parsePrice
=== RUN   Test_parseShortDescription
=== PAUSE Test_parseShortDescription
=== RUN   Test_parseSku
=== PAUSE Test_parseSku
=== RUN   Test_parseCategories
=== PAUSE Test_parseCategories
=== RUN   Test_parseTags
=== PAUSE Test_parseTags
=== CONT  Test_parseTitle
=== CONT  Test_parseSku
=== CONT  Test_parseShortDescription
=== CONT  Test_parsePrice
=== CONT  Test_parseTags
=== CONT  Test_parseCategories
--- PASS: Test_parseShortDescription (0.00s)
--- PASS: Test_parseTitle (0.00s)
--- PASS: Test_parseSku (0.00s)
--- PASS: Test_parsePrice (0.00s)
--- PASS: Test_parseCategories (0.00s)
--- PASS: Test_parseTags (0.00s)
PASS
ok      scrapemelive/scrapemelive       0.005s
</code></pre>
<p>This verifies that our parsing functions work as we expect.</p>
<p>The next step is to add a function to <code>product.go</code> that uses all these functions and returns a <code>Product</code>.</p>
<p>In <code>scrapemelive/product.go</code> add the following function:</p>
<pre><code class="lang-go"><span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">parseProduct</span><span class="hljs-params">(doc *goquery.Document)</span> <span class="hljs-title">Product</span></span> {
        <span class="hljs-keyword">return</span> Product{
                Title:            parseTitle(doc),
                Price:            parsePrice(doc),
                ShortDescription: parseShortDescription(doc),
                Sku:              parseSku(doc),
                Categories:       parseCategories(doc),
                Tags:             parseTags(doc),
        }
}
</code></pre>
<p>Don't forget to add a unit test for this function in <code>scrapemelive/product_test.go</code> as well:</p>
<pre><code class="lang-go"><span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">Test_parseProduct</span><span class="hljs-params">(t *testing.T)</span></span> {
        t.Parallel()
        doc := openTestFile(t, <span class="hljs-string">"../testdata/sample-product.html"</span>)
        product := parseProduct(doc)
        require.Equal(t, <span class="hljs-string">"Charmeleon"</span>, product.Title)
        require.Equal(t, <span class="hljs-string">"£165.00"</span>, product.Price)
        require.Equal(t, <span class="hljs-string">"Charmeleon mercilessly destroys its foes using its sharp claws. If it encounters a strong foe, it turns aggressive. In this excited state, the flame at the tip of its tail flares with a bluish white color."</span>, product.ShortDescription)
        require.Equal(t, <span class="hljs-string">"6565"</span>, product.Sku)
        require.ElementsMatch(t, []<span class="hljs-keyword">string</span>{<span class="hljs-string">"Pokemon"</span>, <span class="hljs-string">"Flame"</span>}, product.Categories)
        require.ElementsMatch(t, []<span class="hljs-keyword">string</span>{<span class="hljs-string">"Blaze"</span>, <span class="hljs-string">"charmeleon"</span>, <span class="hljs-string">"Flame"</span>}, product.Tags)
}
</code></pre>
<p>Make sure your tests still pass by running <code>go test -v ./...</code> .</p>
<p><a target="_blank" href="https://github.com/gosom/scrapemate">Scrapemate</a> accepts jobs that implement a specific interface, <code>scrapemate.IJob</code>. Luckily, there is an implementation of that interface that we can use as a base (<code>scrapemate.Job</code>).</p>
<p>We previously identified that we have two types of pages:</p>
<ul>
<li><p>the listing pages, which contain multiple products and pagination. An example of a listing page is: <a target="_blank" href="https://scrapeme.live/shop/">https://scrapeme.live/shop/</a></p>
</li>
<li><p>the product or detail page, which contains the data for a specific product. An example of a detail page is: <a target="_blank" href="https://scrapeme.live/shop/Charmeleon/">https://scrapeme.live/shop/Charmeleon/</a></p>
</li>
</ul>
<p>The <code>IJob</code> interface has a method with the following signature:</p>
<pre><code class="lang-go">Process(ctx context.Context, resp *scrapemate.Response) (any, []scrapemate.IJob, error)
</code></pre>
<p>It accepts:</p>
<ul>
<li><p>context.Context: which is a context object</p>
</li>
<li><p><code>scrapemate.Response</code>: the Response object that contains the <code>goquery.Document</code> and other fields.</p>
</li>
</ul>
<p>It returns:</p>
<ul>
<li><p>any: this is the result of the scraper</p>
</li>
<li><p>[]scrapemate.IJob: when we parse a web page we may need to instruct the scraper to visit other pages that we have discovered. So we return the next pages to visit there.</p>
</li>
<li><p>error: returns an error if there is one</p>
</li>
</ul>
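<p>To make this contract concrete, here is a toy model of how a framework might consume the three return values: results are collected, and any returned jobs go back onto a queue. The <code>job</code> interface, <code>listJob</code>, <code>detailJob</code>, and <code>run</code> are hypothetical stand-ins written for this illustration. They are not Scrapemate's actual types or scheduling logic, and they skip fetching, retries, and priorities entirely.</p>

```go
package main

import "fmt"

// job is a simplified, hypothetical stand-in for scrapemate.IJob.
// It returns a result, follow-up jobs, and an error, like Process does.
type job interface {
	process() (any, []job, error)
}

// detailJob plays the role of a product page: it yields a result and no new jobs.
type detailJob struct{ name string }

func (d detailJob) process() (any, []job, error) {
	return "product:" + d.name, nil, nil
}

// listJob plays the role of a listing page: no result, one new job per product.
type listJob struct{ products []string }

func (l listJob) process() (any, []job, error) {
	var next []job
	for _, p := range l.products {
		next = append(next, detailJob{name: p})
	}
	return nil, next, nil
}

// run drains a simple FIFO queue, starting from a seed job.
func run(seed job) ([]any, error) {
	queue := []job{seed}
	var results []any
	for len(queue) != 0 {
		current := queue[0]
		queue = queue[1:]
		result, next, err := current.process()
		if err != nil {
			return nil, err
		}
		if result != nil {
			results = append(results, result)
		}
		queue = append(queue, next...)
	}
	return results, nil
}

func main() {
	results, err := run(listJob{products: []string{"Bulbasaur", "Charmeleon"}})
	if err != nil {
		panic(err)
	}
	fmt.Println(results)
}
```

<p>The listing job returns no result of its own, only follow-up jobs, while each detail job returns a result and no jobs. The two job types described next follow the same split.</p>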
<p>Based on the above, and in particular on the <code>[]scrapemate.IJob</code> return value, we need two types of jobs:</p>
<ul>
<li><p>ProductCollectJob: for the listing pages</p>
</li>
<li><p>ProductJob: for the product detail pages</p>
</li>
</ul>
<p>The ProductCollectJob will extract the next-page link and create another ProductCollectJob for it, and will extract the product detail links and create a ProductJob for each of them.</p>
<p>The ProductJob is responsible for returning a Product and has no further jobs to return.</p>
<p>Install some dependencies:</p>
<pre><code class="lang-bash">go get github.com/gosom/scrapemate
go get github.com/gosom/kit/logging
go get github.com/google/uuid
</code></pre>
<h3 id="heading-productjob">ProductJob</h3>
<p>Let's first create the ProductJob in a new file, <code>scrapemelive/detail.go</code>.</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> scrapemelive

<span class="hljs-keyword">import</span> (
        <span class="hljs-string">"context"</span>
        <span class="hljs-string">"errors"</span>

        <span class="hljs-string">"github.com/PuerkitoBio/goquery"</span>
        <span class="hljs-string">"github.com/gosom/kit/logging"</span>
        <span class="hljs-string">"github.com/gosom/scrapemate"</span>
)

<span class="hljs-keyword">type</span> ProductJob <span class="hljs-keyword">struct</span> {
        scrapemate.Job
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-params">(o *ProductJob)</span> <span class="hljs-title">Process</span><span class="hljs-params">(ctx context.Context, resp *scrapemate.Response)</span> <span class="hljs-params">(any, []scrapemate.IJob, error)</span></span> {
        log := ctx.Value(<span class="hljs-string">"log"</span>).(logging.Logger)
        log.Info(<span class="hljs-string">"processing product job"</span>)
        doc, ok := resp.Document.(*goquery.Document)
        <span class="hljs-keyword">if</span> !ok {
                <span class="hljs-keyword">return</span> <span class="hljs-literal">nil</span>, <span class="hljs-literal">nil</span>, errors.New(<span class="hljs-string">"failed to convert response to goquery document"</span>)
        }
        product := parseProduct(doc)
        <span class="hljs-keyword">return</span> product, <span class="hljs-literal">nil</span>, <span class="hljs-literal">nil</span>
}
</code></pre>
<p>Let's explain a bit about what the above is doing:</p>
<p>We create a new struct called <code>ProductJob</code> in which we embed the <code>scrapemate.Job</code> and we implement the method <code>Process</code> to accommodate our needs.</p>
<p>The <code>Process</code> method for that job has to parse the document fetched and extract the data we need and create an instance of <code>Product</code>. We then need to return that newly scraped Product.</p>
<p>Let's go through it almost line by line:</p>
<p>Scrapemate offers you a logger and you can get it from context via</p>
<pre><code class="lang-go">ctx.Value(<span class="hljs-string">"log"</span>).(logging.Logger)
</code></pre>
<p>In order to fetch the document we need to do:</p>
<pre><code class="lang-go">doc, ok := resp.Document.(*goquery.Document)
</code></pre>
<p>Notice the type assertion. It is needed because <code>scrapemate</code> lets you configure the type of document parser you want to use. We will see that when we initialize the framework.</p>
<p>Once we have a document we can extract the information we need and create a product.</p>
<pre><code class="lang-go">product := parseProduct(doc)
</code></pre>
<p>This is the function we created earlier, so there is nothing more to explain here.</p>
<p>Finally, we return</p>
<pre><code class="lang-go"><span class="hljs-keyword">return</span> product, <span class="hljs-literal">nil</span>, <span class="hljs-literal">nil</span>
</code></pre>
<p>The return values are, in order:</p>
<ul>
<li><p>first: the data we parsed, here the <code>product</code></p>
</li>
<li><p>second: the next jobs that the scraper should process - here nothing</p>
</li>
<li><p>third: an error if there is any - here no error</p>
</li>
</ul>
<h3 id="heading-productcollectjob">ProductCollectJob</h3>
<p>The ProductCollectJob will extract the required links (for products and for the next page) and will return new jobs.</p>
<p>Create the file <code>scrapemelive/collect.go</code> and add the following contents:</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> scrapemelive

<span class="hljs-keyword">import</span> (
    <span class="hljs-string">"context"</span>
    <span class="hljs-string">"errors"</span>
    <span class="hljs-string">"time"</span>

    <span class="hljs-string">"github.com/PuerkitoBio/goquery"</span>
    <span class="hljs-string">"github.com/google/uuid"</span>
    <span class="hljs-string">"github.com/gosom/kit/logging"</span>
    <span class="hljs-string">"github.com/gosom/scrapemate"</span>
)

<span class="hljs-keyword">type</span> ProductCollectJob <span class="hljs-keyword">struct</span> {
    scrapemate.Job
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-params">(o *ProductCollectJob)</span> <span class="hljs-title">Process</span><span class="hljs-params">(ctx context.Context, resp *scrapemate.Response)</span> <span class="hljs-params">(any, []scrapemate.IJob, error)</span></span> {
    log := ctx.Value(<span class="hljs-string">"log"</span>).(logging.Logger)
    log.Info(<span class="hljs-string">"processing collect job"</span>)
    doc, ok := resp.Document.(*goquery.Document)
    <span class="hljs-keyword">if</span> !ok {
        <span class="hljs-keyword">return</span> <span class="hljs-literal">nil</span>, <span class="hljs-literal">nil</span>, errors.New(<span class="hljs-string">"failed to convert response to goquery document"</span>)
    }
    <span class="hljs-keyword">var</span> nextJobs []scrapemate.IJob
    links := parseProductLinks(doc)
    <span class="hljs-keyword">for</span> _, link := <span class="hljs-keyword">range</span> links {
        nextJobs = <span class="hljs-built_in">append</span>(nextJobs, &amp;ProductJob{
            Job: scrapemate.Job{
                ID:     uuid.New().String(),
                Method: <span class="hljs-string">"GET"</span>,
                URL:    link,
                Headers: <span class="hljs-keyword">map</span>[<span class="hljs-keyword">string</span>]<span class="hljs-keyword">string</span>{
                    <span class="hljs-string">"User-Agent"</span>: scrapemate.DefaultUserAgent,
                },
                Timeout:    <span class="hljs-number">10</span> * time.Second,
                MaxRetries: <span class="hljs-number">3</span>,
                Priority:   <span class="hljs-number">0</span>,
            },
        })
    }
    nextPage := parseNextPage(doc)
    <span class="hljs-keyword">if</span> nextPage != <span class="hljs-string">""</span> {
        nextJobs = <span class="hljs-built_in">append</span>(nextJobs, &amp;ProductCollectJob{
            Job: scrapemate.Job{
                ID:     uuid.New().String(),
                Method: <span class="hljs-string">"GET"</span>,
                URL:    nextPage,
                Headers: <span class="hljs-keyword">map</span>[<span class="hljs-keyword">string</span>]<span class="hljs-keyword">string</span>{
                    <span class="hljs-string">"User-Agent"</span>: scrapemate.DefaultUserAgent,
                },
                Timeout:    <span class="hljs-number">10</span> * time.Second,
                MaxRetries: <span class="hljs-number">3</span>,
                Priority:   <span class="hljs-number">1</span>,
            },
        })
    }

    <span class="hljs-keyword">return</span> <span class="hljs-literal">nil</span>, nextJobs, <span class="hljs-literal">nil</span>
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">parseProductLinks</span><span class="hljs-params">(doc *goquery.Document)</span> []<span class="hljs-title">string</span></span> {
    <span class="hljs-keyword">var</span> links []<span class="hljs-keyword">string</span>
    doc.Find(<span class="hljs-string">"a.woocommerce-LoopProduct-link"</span>).Each(<span class="hljs-function"><span class="hljs-keyword">func</span><span class="hljs-params">(i <span class="hljs-keyword">int</span>, s *goquery.Selection)</span></span> {
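        <span class="hljs-comment">// skip anchors that have no href attribute</span>
        <span class="hljs-keyword">if</span> link, ok := s.Attr(<span class="hljs-string">"href"</span>); ok {
            links = <span class="hljs-built_in">append</span>(links, link)
        }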
    })
    <span class="hljs-keyword">return</span> links
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">parseNextPage</span><span class="hljs-params">(doc *goquery.Document)</span> <span class="hljs-title">string</span></span> {
    <span class="hljs-keyword">return</span> doc.Find(<span class="hljs-string">"a.next.page-numbers"</span>).AttrOr(<span class="hljs-string">"href"</span>, <span class="hljs-string">""</span>)
}
</code></pre>
<p>Notice that the file ends with two helper functions:</p>
<ul>
<li><p><code>parseProductLinks</code></p>
</li>
<li><p><code>parseNextPage</code></p>
</li>
</ul>
<p>They do what their names imply.<br />The first extracts all the product links from the page and returns them as a slice of strings.</p>
<p>The second extracts the link that takes us to the next page of the pagination, or returns an empty string when there is no next page.</p>
<p>Let's look at the <code>Process</code> method in more detail:</p>
<pre><code class="lang-go"><span class="hljs-keyword">var</span> nextJobs []scrapemate.IJob
    links := parseProductLinks(doc)
    <span class="hljs-keyword">for</span> _, link := <span class="hljs-keyword">range</span> links {
        nextJobs = <span class="hljs-built_in">append</span>(nextJobs, &amp;ProductJob{
            Job: scrapemate.Job{
                ID:     uuid.New().String(),
                Method: <span class="hljs-string">"GET"</span>,
                URL:    link,
                Headers: <span class="hljs-keyword">map</span>[<span class="hljs-keyword">string</span>]<span class="hljs-keyword">string</span>{
                    <span class="hljs-string">"User-Agent"</span>: scrapemate.DefaultUserAgent,
                },
                Timeout:    <span class="hljs-number">10</span> * time.Second,
                MaxRetries: <span class="hljs-number">3</span>,
                Priority:   <span class="hljs-number">0</span>,
            },
        })
    }
</code></pre>
<p>The code above:</p>
<ul>
<li><p>parses the product links out of the webpage</p>
</li>
<li><p>creates a <code>ProductJob</code> for each link and appends it to the <code>nextJobs</code> slice</p>
</li>
</ul>
<pre><code class="lang-go">    nextPage := parseNextPage(doc)
    <span class="hljs-keyword">if</span> nextPage != <span class="hljs-string">""</span> {
        nextJobs = <span class="hljs-built_in">append</span>(nextJobs, &amp;ProductCollectJob{
            Job: scrapemate.Job{
                ID:     uuid.New().String(),
                Method: <span class="hljs-string">"GET"</span>,
                URL:    nextPage,
                Headers: <span class="hljs-keyword">map</span>[<span class="hljs-keyword">string</span>]<span class="hljs-keyword">string</span>{
                    <span class="hljs-string">"User-Agent"</span>: scrapemate.DefaultUserAgent,
                },
                Timeout:    <span class="hljs-number">10</span> * time.Second,
                MaxRetries: <span class="hljs-number">3</span>,
                Priority:   <span class="hljs-number">1</span>,
            },
        })
    }
</code></pre>
<p>Here we first parse the link to the next page.</p>
<p>Then, if that link is not empty, we create a <code>ProductCollectJob</code> for it and append it to the <code>nextJobs</code> slice.</p>
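<p>To make the flow concrete, here is a minimal, self-contained sketch of the same pattern. The <code>job</code> struct and the <code>buildNextJobs</code> function are hypothetical stand-ins for illustration, not Scrapemate types; only the priorities (0 for product pages, 1 for the next listing page) are taken from the code above.</p>

```go
package main

import "fmt"

// job is a hypothetical, simplified stand-in for a scraping job.
type job struct {
	Kind     string // "product" or "collect"
	URL      string
	Priority int
}

// buildNextJobs mirrors the shape of the Process method above: one product
// job per link, plus one collect job when a next page exists.
func buildNextJobs(links []string, nextPage string) []job {
	var next []job
	for _, link := range links {
		next = append(next, job{Kind: "product", URL: link, Priority: 0})
	}
	if nextPage != "" {
		next = append(next, job{Kind: "collect", URL: nextPage, Priority: 1})
	}
	return next
}

func main() {
	jobs := buildNextJobs(
		[]string{"https://scrapeme.live/shop/Blacephalon/"},
		"https://scrapeme.live/shop/page/2/",
	)
	for _, j := range jobs {
		fmt.Println(j.Kind, j.Priority, j.URL)
	}
}
```

<p>On the last listing page the next-page link is empty, so no further collect job is created and the crawl of the listing stops on its own.</p>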
<p>Let's write some tests for the <code>parseProductLinks</code> and <code>parseNextPage</code> functions.</p>
<p>Fetch a listing page and store its HTML in the <code>testdata</code> folder:</p>
<pre><code class="lang-bash">curl -o testdata/sample-category.html <span class="hljs-string">'https://scrapeme.live/shop/'</span>
</code></pre>
<p>Create a file <code>scrapemelive/collect_test.go</code> and add:</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> scrapemelive

<span class="hljs-keyword">import</span> (
        <span class="hljs-string">"testing"</span>

        <span class="hljs-string">"github.com/stretchr/testify/require"</span>
)

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">Test_parseProductLinks</span><span class="hljs-params">(t *testing.T)</span></span> {
        t.Parallel()
        doc := openTestFile(t, <span class="hljs-string">"../testdata/sample-category.html"</span>)
        links := parseProductLinks(doc)
        require.Len(t, links, <span class="hljs-number">16</span>)
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">Test_parseNextPage</span><span class="hljs-params">(t *testing.T)</span></span> {
        t.Parallel()
        doc := openTestFile(t, <span class="hljs-string">"../testdata/sample-category.html"</span>)
        nextPage := parseNextPage(doc)
        require.Equal(t, <span class="hljs-string">"https://scrapeme.live/shop/page/2/"</span>, nextPage)
}
</code></pre>
<p>Make sure the tests pass:</p>
<pre><code class="lang-bash">go <span class="hljs-built_in">test</span> -v ./...
</code></pre>
<h3 id="heading-main-function">Main function</h3>
<p>Now it's time to write our <code>main.go</code> file.</p>
<p>Install another dependency:</p>
<pre><code class="lang-bash">go get github.com/gosom/scrapemate/adapters/cache/leveldbcache
</code></pre>
<p>Now add the following to your <code>main.go</code>:</p>
<pre><code class="lang-go"><span class="hljs-keyword">package</span> main

<span class="hljs-keyword">import</span> (
    <span class="hljs-string">"context"</span>
    <span class="hljs-string">"errors"</span>
    <span class="hljs-string">"net/http"</span>
    <span class="hljs-string">"os"</span>
    <span class="hljs-string">"time"</span>

    <span class="hljs-string">"github.com/google/uuid"</span>
    <span class="hljs-string">"github.com/gosom/scrapemate"</span>
    <span class="hljs-string">"github.com/gosom/scrapemate/adapters/cache/leveldbcache"</span>
    fetcher <span class="hljs-string">"github.com/gosom/scrapemate/adapters/fetchers/nethttp"</span>
    parser <span class="hljs-string">"github.com/gosom/scrapemate/adapters/parsers/goqueryparser"</span>
    provider <span class="hljs-string">"github.com/gosom/scrapemate/adapters/providers/memory"</span>

    <span class="hljs-string">"scrapemelive/scrapemelive"</span>
)

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">main</span><span class="hljs-params">()</span></span> {
    err := run()
    <span class="hljs-keyword">if</span> err == <span class="hljs-literal">nil</span> || errors.Is(err, scrapemate.ErrorExitSignal) {
        os.Exit(<span class="hljs-number">0</span>)
    }
    os.Exit(<span class="hljs-number">1</span>)
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">run</span><span class="hljs-params">()</span> <span class="hljs-title">error</span></span> {
    ctx, cancel := context.WithCancelCause(context.Background())
    <span class="hljs-keyword">defer</span> cancel(errors.New(<span class="hljs-string">"deferred cancel"</span>))

    provider := provider.New()

    <span class="hljs-keyword">go</span> <span class="hljs-function"><span class="hljs-keyword">func</span><span class="hljs-params">()</span></span> {
        job := &amp;scrapemelive.ProductCollectJob{
            Job: scrapemate.Job{
                ID:     uuid.New().String(),
                Method: http.MethodGet,
                URL:    <span class="hljs-string">"https://scrapeme.live/shop/"</span>,
                Headers: <span class="hljs-keyword">map</span>[<span class="hljs-keyword">string</span>]<span class="hljs-keyword">string</span>{
                    <span class="hljs-string">"User-Agent"</span>: scrapemate.DefaultUserAgent,
                },
                Timeout:    <span class="hljs-number">10</span> * time.Second,
                MaxRetries: <span class="hljs-number">3</span>,
            },
        }
        provider.Push(ctx, job)
    }()

    httpFetcher := fetcher.New(&amp;http.Client{
        Timeout: <span class="hljs-number">10</span> * time.Second,
    })

    cacher, err := leveldbcache.NewLevelDBCache(<span class="hljs-string">"__leveldb_cache"</span>)
    <span class="hljs-keyword">if</span> err != <span class="hljs-literal">nil</span> {
        <span class="hljs-keyword">return</span> err
    }

    mate, err := scrapemate.New(
        scrapemate.WithContext(ctx, cancel),
        scrapemate.WithJobProvider(provider),
        scrapemate.WithHTTPFetcher(httpFetcher),
        scrapemate.WithConcurrency(<span class="hljs-number">10</span>),
        scrapemate.WithHTMLParser(parser.New()),
        scrapemate.WithCache(cacher),
    )

    <span class="hljs-keyword">if</span> err != <span class="hljs-literal">nil</span> {
        <span class="hljs-keyword">return</span> err
    }

    resultsDone := <span class="hljs-built_in">make</span>(<span class="hljs-keyword">chan</span> <span class="hljs-keyword">struct</span>{})
    <span class="hljs-keyword">go</span> <span class="hljs-function"><span class="hljs-keyword">func</span><span class="hljs-params">()</span></span> {
        <span class="hljs-keyword">defer</span> <span class="hljs-built_in">close</span>(resultsDone)
        <span class="hljs-keyword">if</span> err := writeCsv(mate.Results()); err != <span class="hljs-literal">nil</span> {
            cancel(err)
            <span class="hljs-keyword">return</span>
        }
    }()

    err = mate.Start()
    &lt;-resultsDone
    <span class="hljs-keyword">return</span> err
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">writeCsv</span><span class="hljs-params">(results &lt;-<span class="hljs-keyword">chan</span> scrapemate.Result)</span> <span class="hljs-title">error</span></span> {
    <span class="hljs-comment">// TODO</span>
    <span class="hljs-keyword">return</span> <span class="hljs-literal">nil</span>
}
</code></pre>
<p>Let's go over it slowly:</p>
<p>First, <code>main</code> calls the function <code>run</code>, which returns an error.</p>
<p><code>scrapemate</code> returns a special error, <code>ErrorExitSignal</code>, when the program exits because it captured a SIGINT. In that case, or when there is no error at all, we want the program to exit with status code 0.</p>
<p>In all other cases, we exit with status code 1.</p>
<p>Now, the <code>run</code> function. A lot is happening here.</p>
<p><code>Scrapemate</code> requires us to first declare a job provider:</p>
<pre><code class="lang-go">provider := provider.New()

    <span class="hljs-keyword">go</span> <span class="hljs-function"><span class="hljs-keyword">func</span><span class="hljs-params">()</span></span> {
        job := &amp;scrapemelive.ProductCollectJob{
            Job: scrapemate.Job{
                ID:     uuid.New().String(),
                Method: http.MethodGet,
                URL:    <span class="hljs-string">"https://scrapeme.live/shop/"</span>,
                Headers: <span class="hljs-keyword">map</span>[<span class="hljs-keyword">string</span>]<span class="hljs-keyword">string</span>{
                    <span class="hljs-string">"User-Agent"</span>: scrapemate.DefaultUserAgent,
                },
                Timeout:    <span class="hljs-number">10</span> * time.Second,
                MaxRetries: <span class="hljs-number">3</span>,
            },
        }
        provider.Push(ctx, job)
    }()
</code></pre>
<p>A provider is a data structure that provides jobs to the scraper. Here we want to start our crawler with the homepage of the e-shop, so we create the initial job and push it to the provider.</p>
<p>We also have to define how we are going to fetch webpages. For this purpose, we need an instance of a <code>fetcher</code>:</p>
<pre><code class="lang-go">httpFetcher := fetcher.New(&amp;http.Client{
        Timeout: <span class="hljs-number">10</span> * time.Second,
    })
</code></pre>
<p>We also want to cache the responses, so we initialize an instance of a <code>Cacher</code>:</p>
<pre><code class="lang-go">cacher, err := leveldbcache.NewLevelDBCache(<span class="hljs-string">"__leveldb_cache"</span>)
    <span class="hljs-keyword">if</span> err != <span class="hljs-literal">nil</span> {
        <span class="hljs-keyword">return</span> err
    }
</code></pre>
<p>Here we cache using <code>leveldb</code>; the database will be created in a folder named <code>__leveldb_cache</code>.</p>
<p>We can now initialize our scraper:</p>
<pre><code class="lang-go">mate, err := scrapemate.New(
        scrapemate.WithContext(ctx, cancel),
        scrapemate.WithJobProvider(provider),
        scrapemate.WithHTTPFetcher(httpFetcher),
        scrapemate.WithConcurrency(<span class="hljs-number">10</span>),
        scrapemate.WithHTMLParser(parser.New()),
        scrapemate.WithCache(cacher),
    )

    <span class="hljs-keyword">if</span> err != <span class="hljs-literal">nil</span> {
        <span class="hljs-keyword">return</span> err
    }
</code></pre>
<p>Notice <code>WithConcurrency</code>: it configures the framework to use 10 parallel workers.</p>
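<p>If the idea of a fixed number of parallel workers is new to you, the following standalone sketch shows the general pattern with plain goroutines and a channel. It is only an illustration under my own assumptions (the <code>processAll</code> function and the example URLs are hypothetical) and says nothing about how Scrapemate implements its workers internally.</p>

```go
package main

import (
	"fmt"
	"sync"
)

// processAll fans the given URLs out to a fixed number of workers and
// returns how many were handled. It is a hypothetical illustration of a
// worker pool, not Scrapemate's implementation.
func processAll(urls []string, workers int) int {
	jobs := make(chan string)
	var (
		wg      sync.WaitGroup
		mu      sync.Mutex
		handled int
	)
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for range jobs {
				// A real worker would fetch and parse the page here.
				mu.Lock()
				handled++
				mu.Unlock()
			}
		}()
	}
	for _, u := range urls {
		jobs <- u
	}
	close(jobs)
	wg.Wait()
	return handled
}

func main() {
	urls := make([]string, 25)
	for i := range urls {
		urls[i] = fmt.Sprintf("https://example.com/page/%d", i)
	}
	fmt.Println(processAll(urls, 10)) // prints 25
}
```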
<p><code>WithHTMLParser</code> configures the HTML parser. We chose the default one, which uses <code>goquery</code>.</p>
<pre><code class="lang-go">scrapemate.WithHTMLParser(parser.New()),
</code></pre>
<p>Once <code>Scrapemate</code> finishes a job, it pushes the result into a channel. To access that channel we can use the <code>mate.Results()</code> method.</p>
<pre><code class="lang-go">resultsDone := <span class="hljs-built_in">make</span>(<span class="hljs-keyword">chan</span> <span class="hljs-keyword">struct</span>{})
    <span class="hljs-keyword">go</span> <span class="hljs-function"><span class="hljs-keyword">func</span><span class="hljs-params">()</span></span> {
        <span class="hljs-keyword">defer</span> <span class="hljs-built_in">close</span>(resultsDone)
        <span class="hljs-keyword">if</span> err := writeCsv(mate.Results()); err != <span class="hljs-literal">nil</span> {
            cancel(err)
            <span class="hljs-keyword">return</span>
        }
    }()
</code></pre>
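<p>The above snippet starts a new goroutine that is responsible for writing the results as CSV. If writing fails, it calls <code>cancel(err)</code>, so the error is recorded as the cause of the cancellation.</p>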
<p>In the last part</p>
<pre><code class="lang-go">err = mate.Start()
&lt;-resultsDone
<span class="hljs-keyword">return</span> err
</code></pre>
<p>we just start the scraper and wait until all the results are written.</p>
<p>Even when it has no more jobs, the scraper keeps waiting until you stop it with CTRL-C.</p>
<h3 id="heading-csv-writing">CSV writing</h3>
<p>In our task description, we want to create a CSV with headers:</p>
<pre><code class="lang-bash">title,price,short_description,sku,categories,tags
</code></pre>
<p>Open <code>scrapemelive.go</code> and add the following (make sure <code>strings</code> is in the import list):</p>
<pre><code class="lang-go"><span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-params">(o Product)</span> <span class="hljs-title">CsvHeaders</span><span class="hljs-params">()</span> []<span class="hljs-title">string</span></span> {
    <span class="hljs-keyword">return</span> []<span class="hljs-keyword">string</span>{
        <span class="hljs-string">"title"</span>,
        <span class="hljs-string">"price"</span>,
        <span class="hljs-string">"short_description"</span>,
        <span class="hljs-string">"sku"</span>,
        <span class="hljs-string">"categories"</span>,
        <span class="hljs-string">"tags"</span>,
    }
}

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-params">(o Product)</span> <span class="hljs-title">CsvRow</span><span class="hljs-params">()</span> []<span class="hljs-title">string</span></span> {
    <span class="hljs-keyword">return</span> []<span class="hljs-keyword">string</span>{
        o.Title,
        o.Price,
        o.ShortDescription,
        o.Sku,
        strings.Join(o.Categories, <span class="hljs-string">","</span>),
        strings.Join(o.Tags, <span class="hljs-string">","</span>),
    }
}
</code></pre>
<p>Now open <code>main.go</code>, add <code>encoding/csv</code> and <code>fmt</code> to the imports, and replace the function <code>writeCsv</code> with the following:</p>
<pre><code class="lang-go"><span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">writeCsv</span><span class="hljs-params">(results &lt;-<span class="hljs-keyword">chan</span> scrapemate.Result)</span> <span class="hljs-title">error</span></span> {
    w := csv.NewWriter(os.Stdout)
    <span class="hljs-keyword">defer</span> w.Flush()
    headersWritten := <span class="hljs-literal">false</span>
    <span class="hljs-keyword">for</span> result := <span class="hljs-keyword">range</span> results {
        <span class="hljs-keyword">if</span> result.Data == <span class="hljs-literal">nil</span> {
            <span class="hljs-keyword">continue</span>
        }
        product, ok := result.Data.(scrapemelive.Product)
        <span class="hljs-keyword">if</span> !ok {
            <span class="hljs-keyword">return</span> fmt.Errorf(<span class="hljs-string">"unexpected data type: %T"</span>, result.Data)
        }
        <span class="hljs-keyword">if</span> !headersWritten {
            <span class="hljs-keyword">if</span> err := w.Write(product.CsvHeaders()); err != <span class="hljs-literal">nil</span> {
                <span class="hljs-keyword">return</span> err
            }
            headersWritten = <span class="hljs-literal">true</span>
        }
        <span class="hljs-keyword">if</span> err := w.Write(product.CsvRow()); err != <span class="hljs-literal">nil</span> {
            <span class="hljs-keyword">return</span> err
        }
        w.Flush()
    }
    <span class="hljs-keyword">return</span> w.Error()
}
</code></pre>
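<p>The code above takes each result from the results channel, skips results that carry no data, and writes the rest as CSV rows to standard output. The header row is written once, before the first product.</p>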
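<p>Since <code>CsvRow</code> joins categories and tags with commas, it is worth seeing how <code>encoding/csv</code> handles such fields: it wraps them in double quotes, so they remain a single column. A minimal sketch, where <code>toCSVLine</code> is a hypothetical helper and the product values are made up for illustration:</p>

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// toCSVLine encodes a single record with encoding/csv, but into a string
// so the quoting is easy to inspect.
func toCSVLine(record []string) (string, error) {
	var sb strings.Builder
	w := csv.NewWriter(&sb)
	if err := w.Write(record); err != nil {
		return "", err
	}
	w.Flush()
	return sb.String(), w.Error()
}

func main() {
	// Hypothetical product values, for illustration only.
	tags := strings.Join([]string{"bulbasaur", "Overgrow", "Seed"}, ",")
	line, err := toCSVLine([]string{"Bulbasaur", "63.00", tags})
	if err != nil {
		panic(err)
	}
	fmt.Print(line) // prints Bulbasaur,63.00,"bulbasaur,Overgrow,Seed"
}
```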
<h2 id="heading-run-the-scraper">Run the scraper</h2>
<p>Now that all the code is in place we can run our scraper.</p>
<pre><code class="lang-bash">go run main.go 1&gt;pokemons.csv
</code></pre>
<p>This will take some time. Meanwhile, you will see the logs on your screen, because only standard output is redirected to the file.</p>
<p>Once the logs stop updating, wait a few seconds and hit CTRL-C.</p>
<pre><code class="lang-bash">{<span class="hljs-string">"level"</span>:<span class="hljs-string">"info"</span>,<span class="hljs-string">"component"</span>:<span class="hljs-string">"scrapemate"</span>,<span class="hljs-string">"job"</span>:<span class="hljs-string">"Job{ID: 4de24748-1e8e-4ab7-843f-8571ca8b2d49, Method: GET, URL: https://scrapeme.live/shop/Blacephalon/, UrlParams: map[]}"</span>,<span class="hljs-string">"status"</span>:<span class="hljs-string">"success"</span>,<span class="hljs-string">"duration"</span>:1323.897329,<span class="hljs-string">"time"</span>:<span class="hljs-string">"2023-04-14T07:41:41.911948749Z"</span>,<span class="hljs-string">"message"</span>:<span class="hljs-string">"job finished"</span>}
^C{<span class="hljs-string">"level"</span>:<span class="hljs-string">"info"</span>,<span class="hljs-string">"component"</span>:<span class="hljs-string">"scrapemate"</span>,<span class="hljs-string">"time"</span>:<span class="hljs-string">"2023-04-14T07:42:13.345446924Z"</span>,<span class="hljs-string">"message"</span>:<span class="hljs-string">"received signal, shutting down"</span>}
{<span class="hljs-string">"level"</span>:<span class="hljs-string">"info"</span>,<span class="hljs-string">"component"</span>:<span class="hljs-string">"scrapemate"</span>,<span class="hljs-string">"time"</span>:<span class="hljs-string">"2023-04-14T07:42:13.345757168Z"</span>,<span class="hljs-string">"message"</span>:<span class="hljs-string">"scrapemate exited"</span>}
</code></pre>
<p>You should see something like the above.</p>
<p>The results should be in the <code>pokemons.csv</code> file.</p>
<h2 id="heading-summary">Summary</h2>
<p>In this tutorial, I have shown how to use Golang and Scrapemate to extract data from a website.</p>
<p>Specifically, I demonstrated how to scrape product information from the website <a target="_blank" href="https://scrapeme.live/shop/"><strong>https://scrapeme.live/shop/</strong></a> by extracting the title, price, short_description, sku, tags, and categories for each of the Pokemon products on the site.</p>
<p>I used Scrapemate, a Golang-based scraping framework, to perform the web scraping, and then wrote the scraped data to a CSV file.</p>
<p>This example illustrates how web scraping can be a powerful tool for collecting information for research, analysis, or automation, and how Scrapemate can simplify the process of building scraping tools in Golang.</p>
<p>You can find all the code above on <a target="_blank" href="https://github.com/gosom/scrapemate-example-scrapemelive">GitHub</a>.</p>
]]></content:encoded></item><item><title><![CDATA[Managing Distributed Transactions in PostgreSQL and Golang using two phase commit]]></title><description><![CDATA[If you're building a distributed system with PostgreSQL as the database backend, you might have encountered issues with managing transactions across multiple nodes. When a transaction spans multiple databases, ensuring atomicity and consistency can b...]]></description><link>https://blog.gkomninos.com/managing-distributed-transactions-in-postgresql-and-golang-using-two-phase-commit</link><guid isPermaLink="true">https://blog.gkomninos.com/managing-distributed-transactions-in-postgresql-and-golang-using-two-phase-commit</guid><category><![CDATA[Go Language]]></category><category><![CDATA[PostgreSQL]]></category><category><![CDATA[transactions]]></category><category><![CDATA[atomicity]]></category><category><![CDATA[two-phase-commit]]></category><dc:creator><![CDATA[Georgios Komninos]]></dc:creator><pubDate>Sat, 25 Mar 2023 20:04:09 GMT</pubDate><content:encoded><![CDATA[<p>If you're building a distributed system with PostgreSQL as the database backend, you might have encountered issues with managing transactions across multiple nodes. When a transaction spans multiple databases, ensuring atomicity and consistency can be a challenge. That's where <a target="_blank" href="https://github.com/gosom/gosql2pc"><code>gosql2pc</code></a> comes in.</p>
<p><a target="_blank" href="https://github.com/gosom/gosql2pc"><code>gosql2pc</code></a> is a Golang library for implementing 2 phase commit transactions in PostgreSQL, ensuring atomicity and consistency across distributed systems. With <code>gosql2pc</code>, you can manage transactions across multiple databases, ensuring that all changes are committed or rolled back atomically.</p>
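<p>If you have not met two-phase commit before, the idea fits in a few lines. The sketch below uses hypothetical in-memory participants instead of real databases and is not how <code>gosql2pc</code> is implemented; it only shows the ordering: prepare everyone first, commit only if every prepare succeeded, otherwise roll back whatever was prepared.</p>

```go
package main

import (
	"errors"
	"fmt"
)

// participant is a hypothetical in-memory stand-in for one database.
type participant struct {
	name        string
	failPrepare bool
	state       string // "", "prepared", "committed" or "rolled back"
}

// prepare is phase one: the participant promises it will be able to commit.
func (p *participant) prepare() error {
	if p.failPrepare {
		return errors.New(p.name + ": prepare failed")
	}
	p.state = "prepared"
	return nil
}

// twoPhaseCommit prepares every participant first. Only if all of them
// succeed does it commit; otherwise it rolls back the prepared ones.
func twoPhaseCommit(ps []*participant) error {
	for i, p := range ps {
		if err := p.prepare(); err != nil {
			for _, q := range ps[:i] {
				q.state = "rolled back"
			}
			return err
		}
	}
	for _, p := range ps {
		p.state = "committed"
	}
	return nil
}

func main() {
	users := &participant{name: "users"}
	orders := &participant{name: "orders", failPrepare: true}
	err := twoPhaseCommit([]*participant{users, orders})
	fmt.Println(err, users.state) // prints: orders: prepare failed rolled back
}
```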
<h2 id="heading-getting-started">Getting Started</h2>
<p>To get started with <code>gosql2pc</code>, you need to have a working knowledge of Golang and PostgreSQL. You also need to have PostgreSQL installed, as <code>gosql2pc</code> relies on the PostgreSQL 2-phase commit protocol.</p>
<p>Once you have PostgreSQL installed, you can install <code>gosql2pc</code> using the following command:</p>
<pre><code class="lang-bash">go get github.com/gosom/gosql2pc
</code></pre>
<p>Then, use the library's API to create participants for your distributed transaction.</p>
<p>Here's an example of using gosql2pc to create a simple distributed transaction that inserts a new user and order into two separate databases:</p>
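<pre><code class="lang-go"><span class="hljs-comment">// Assumes the library is imported under the name twophase, as the calls below imply.</span>
<span class="hljs-comment">// Create the participants for the 2 phase commit</span>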
p1 := twophase.NewParticipant(db1, <span class="hljs-function"><span class="hljs-keyword">func</span><span class="hljs-params">(ctx context.Context, tx *sql.Tx)</span> <span class="hljs-title">error</span></span> {
    _, err := tx.ExecContext(ctx, <span class="hljs-string">"INSERT INTO users (id, name) VALUES ($1, $2)"</span>, userID, name)
    <span class="hljs-keyword">return</span> err
})

p2 := twophase.NewParticipant(db2, <span class="hljs-function"><span class="hljs-keyword">func</span><span class="hljs-params">(ctx context.Context, tx *sql.Tx)</span> <span class="hljs-title">error</span></span> {
    _, err := tx.ExecContext(ctx, <span class="hljs-string">"INSERT INTO orders (id, user_id, amount) VALUES ($1, $2, $3)"</span>, orderID, userID, amount)
    <span class="hljs-keyword">return</span> err
})

<span class="hljs-comment">// setup the parameters for the transaction</span>
params := twophase.Params{
    Participants: []twophase.Participant{p1, p2},
}

<span class="hljs-comment">// run the transaction</span>
<span class="hljs-keyword">if</span> err := twophase.Do(context.Background(), params); err != <span class="hljs-literal">nil</span> {
    <span class="hljs-built_in">panic</span>(err)
}
</code></pre>
<p>You can find more examples in the library's <a target="_blank" href="https://github.com/gosom/gosql2pc/tree/main/example"><code>example</code></a> directory.</p>
<p><strong>Notes</strong></p>
<p>It's worth noting that distributed transactions can be tricky to manage, and gosql2pc is no exception. PostgreSQL disables prepared transactions by default (<code>max_prepared_transactions</code> is 0) for a good reason: enabling them can lead to orphaned transactions and data inconsistencies if they are not monitored carefully.</p>
<p>Nevertheless, gosql2pc provides a useful tool for simplifying the implementation of distributed transactions. If you find a bug or want to suggest a new feature, contributions are always welcome.</p>
<p>To learn more about gosql2pc, including how to enable prepared transactions and monitor for orphaned transactions, check out the library's <a target="_blank" href="https://github.com/gosom/gosql2pc">README</a> file and the accompanying blog posts.</p>
]]></content:encoded></item><item><title><![CDATA[Introducing Address Parser Go REST: A Simple Solution for Address Parsing]]></title><description><![CDATA[Parsing addresses can be a complex and time-consuming task.
Different address formats and variations can make it challenging to extract accurate information from address strings.
Fortunately, Address Parser Go REST provides a simple solution to this ...]]></description><link>https://blog.gkomninos.com/introducing-address-parser-go-rest-a-simple-solution-for-address-parsing</link><guid isPermaLink="true">https://blog.gkomninos.com/introducing-address-parser-go-rest-a-simple-solution-for-address-parsing</guid><category><![CDATA[API development ]]></category><category><![CDATA[golang]]></category><category><![CDATA[REST API]]></category><category><![CDATA[address-parsing]]></category><category><![CDATA[libpostal]]></category><dc:creator><![CDATA[Georgios Komninos]]></dc:creator><pubDate>Thu, 09 Mar 2023 20:16:58 GMT</pubDate><content:encoded><![CDATA[<p>Parsing addresses can be a complex and time-consuming task.</p>
<p>Different address formats and variations can make it challenging to extract accurate information from address strings.</p>
<p>Fortunately, Address Parser Go REST provides a simple solution to this problem.</p>
<p>This REST API allows you to parse addresses into their individual components quickly and easily.</p>
<h2 id="heading-what-is-address-parser-go-rest">What is Address Parser Go REST?</h2>
<p>Address Parser Go REST is a REST API that uses the libpostal library to parse addresses into their individual components.</p>
<p>By submitting a request to the API with an address, you will receive a JSON response with the parsed components.</p>
<p><em>This API removes the need to include the libpostal library as a dependency in your project and allows you to parse addresses with ease.</em></p>
<h2 id="heading-how-does-address-parser-go-rest-work">How does Address Parser Go REST work?</h2>
<p>Address Parser Go REST uses the libpostal library to parse addresses.</p>
<p>The library is a powerful natural language processing library for addresses that can handle different address formats, variations, and misspellings, ensuring accurate parsing results.</p>
<p>The API is built with Go, a fast and efficient programming language, which allows for speedy processing of requests.</p>
<h3 id="heading-getting-started-with-address-parser-go-rest">Getting started with Address Parser Go REST</h3>
<pre><code class="lang-bash">docker run -p 8080:8080 gosom/address-parser-go-rest:v1.0.1
</code></pre>
<p>This command will run the Address Parser Go REST API container and map port 8080 of the container to port 8080 of your machine.</p>
<p>Now you can start parsing addresses! Here's an example of how to parse an address using the API:</p>
<pre><code class="lang-bash">curl --location --request POST <span class="hljs-string">'http://localhost:8080/parse'</span> --header <span class="hljs-string">'Content-Type: application/json'</span> --data-raw <span class="hljs-string">'{
    "address": "1600 Amphitheatre Parkway, Mountain View, CA 94043",
    "title_case": true
}'</span>
</code></pre>
<p>This command sends a POST request to the <code>/parse</code> endpoint of the API with a JSON payload containing an address string. The API will parse the address and return a JSON response with the individual components of the address:</p>
<pre><code class="lang-bash">{
  <span class="hljs-string">"house_number"</span>: <span class="hljs-string">"1600"</span>,
  <span class="hljs-string">"road"</span>: <span class="hljs-string">"Amphitheatre Parkway"</span>,
  <span class="hljs-string">"postcode"</span>: <span class="hljs-string">"94043"</span>,
  <span class="hljs-string">"city"</span>: <span class="hljs-string">"Mountain View"</span>,
  <span class="hljs-string">"state"</span>: <span class="hljs-string">"Ca"</span>
}
</code></pre>
<p>The <code>title_case</code> parameter is optional. By default, libpostal returns all components in lower case; if you omit <code>title_case</code>, the API does the same.</p>
<p>That's it! You've successfully parsed an address using Address Parser Go REST.</p>
<p>You can see the available options and the responses in the Swagger documentation.</p>
<p>Access it by visiting http://localhost:8080/docs/.</p>
<h3 id="heading-acknowledgements">Acknowledgements</h3>
<p>We would like to acknowledge the contributors of the <a target="_blank" href="https://github.com/openvenues/libpostal">libpostal</a> library and the <a target="_blank" href="https://github.com/openvenues/gopostal">Go bindings</a> used in this project.</p>
<p>We appreciate their hard work and dedication to creating these useful tools.</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>Address Parser Go REST is a simple solution for parsing addresses that can save you time and effort.</p>
<p>With its powerful parsing capabilities and efficient processing, this REST API is a useful tool for any project that requires address parsing.</p>
<p>If you encounter any issues or have feedback to provide, please create a new issue on the <a target="_blank" href="https://github.com/gosom/address-parser-go-rest">GitHub</a> repository.</p>
]]></content:encoded></item><item><title><![CDATA[FOSDEM 2023: My Top Highlights from the Open-Source Conference]]></title><description><![CDATA[Introduction
FOSDEM 2023 (Free and Open Source Software Developers' European Meeting) is an annual event that takes place in Brussels, Belgium. The conference brings together a diverse range of open-source enthusiasts, developers, and community membe...]]></description><link>https://blog.gkomninos.com/fosdem-2023-my-top-highlights-from-the-open-source-conference</link><guid isPermaLink="true">https://blog.gkomninos.com/fosdem-2023-my-top-highlights-from-the-open-source-conference</guid><category><![CDATA[Go Language]]></category><category><![CDATA[conference]]></category><category><![CDATA[Open Source]]></category><category><![CDATA[Programming Blogs]]></category><category><![CDATA[fosdem]]></category><dc:creator><![CDATA[Georgios Komninos]]></dc:creator><pubDate>Sat, 11 Feb 2023 09:56:10 GMT</pubDate><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p><a target="_blank" href="https://fosdem.org/2023/">FOSDEM 2023</a> (Free and Open Source Software Developers' European Meeting) is an annual event that takes place in Brussels, Belgium. The conference brings together a diverse range of open-source enthusiasts, developers, and community members from around the world to exchange ideas, share knowledge, and collaborate on the latest open-source technologies and projects. FOSDEM offers a wide range of sessions, talks, and workshops that cover a variety of topics, including programming, system administration, and security. With a strong focus on community building and collaboration, FOSDEM is a must-attend event for anyone interested in the open-source world.</p>
<h2 id="heading-preparation">Preparation</h2>
<p>Preparing for the FOSDEM conference involved researching the conference schedule and speakers, and booking travel and accommodation. Since this was my fourth time there, I knew what to expect and there were no surprises. This year I joined the event with two friends. We started the conference on Friday with a few beers :)</p>
<p>Selecting which talks to attend at the FOSDEM conference can be a challenging task, as there are often a large number of interesting sessions and workshops to choose from. With a wide variety of topics covering everything from programming and system administration to security and community building, it can be difficult to decide which sessions will provide the most value. It's important to consider one's professional interests and goals when selecting talks, but it's also helpful to step outside of one's comfort zone and attend sessions that cover new and unfamiliar topics.</p>
<h2 id="heading-my-schedule">My Schedule</h2>
<p>This year I decided to attend only two devrooms: the one focused on <a target="_blank" href="https://go.dev/">Go</a> and the one focused on <a target="_blank" href="https://www.postgresql.org/"><strong>PostgreSQL</strong></a>. Go talks were on Saturday and PostgreSQL talks on Sunday.</p>
<h3 id="heading-saturday">Saturday</h3>
<ul>
<li><p><a target="_blank" href="https://fosdem.org/2023/schedule/event/gostateofgo/">The State of Go</a>: This is a talk about the changes/new features in the latest go version. If you are a Go developer I <em>highly recommend</em> watching it or reading the slides</p>
</li>
<li><p><a target="_blank" href="https://fosdem.org/2023/schedule/event/goreducecognitive/">Recipes</a> for reducing cognitive load: Interesting talk about how you can write code that is easier to understand. <a target="_blank" href="https://github.com/fedepaol">Federico</a> gave some nice tips and examples.<br />  <em>Recommend</em> watching</p>
</li>
<li><p><a target="_blank" href="https://fosdem.org/2023/schedule/event/goreconciliation/">Reconciliation Pattern, Control Theory and Cluster API</a>: A talk about the reconciliation pattern that is used in Kubernetes to ensure that the current state is the desired state.</p>
</li>
<li><p><a target="_blank" href="https://fosdem.org/2023/schedule/event/gofivestepsefficient/">Five Steps to Make Your Go Code Faster &amp; More Efficient</a>: Nice talk from <a target="_blank" href="https://bwplotka.dev/book">Bartłomiej</a>. It explains a methodology of how you can optimize your code.<br />  <em>Recommend</em> watching</p>
</li>
<li><p><a target="_blank" href="https://fosdem.org/2023/schedule/event/goheadscale/">Headscale: How we are using integration testing to reimplement Tailscale</a>: Interesting talk about Kristofer's and Juan's journey implementing <a target="_blank" href="https://github.com/juanfont/headscale">headscale</a> .<br />  <em>Recommend</em> watching</p>
</li>
<li><p><a target="_blank" href="https://fosdem.org/2023/schedule/event/gobuildingdatabase/">Our Mad Journey of Building a Vector Database in Go</a>: A talk about <a target="_blank" href="https://weaviate.io/">https://weaviate.io/</a> . This is a vector search database built in Go. The talk touches Go internals and I found it very interesting.<br />  <em>Recommend</em> watching</p>
</li>
<li><p><a target="_blank" href="https://fosdem.org/2023/schedule/event/gowatermill/">Building a basic event-driven application in Go in 20 minutes</a>: Talk about the <a target="_blank" href="https://watermill.io">watermill</a> library. A nice talk which quickly demonstrates the main feature of the library</p>
</li>
</ul>
<p>I learned something from every talk, so if you have the time, watch all of them. If I had to suggest only two, my picks would be "The State of Go" and "Our Mad Journey of Building a Vector Database in Go".</p>
<h3 id="heading-sunday">Sunday</h3>
<ul>
<li><p><a target="_blank" href="https://fosdem.org/2023/schedule/event/postgresql_tour_de_data_types_varchar2_or_char_255/">Tour de Data Types: VARCHAR2 or CHAR(255)?</a>: Very nice talk about data types in Postgres with nice tips. <a target="_blank" href="https://andreas.scherbaum.la/blog/">Andreas</a> did a very nice presentation<br />  <em>Recommend</em> watching</p>
</li>
<li><p><a target="_blank" href="https://fosdem.org/2023/schedule/event/postgresql_how_to_give_your_postgres_blog_posts_an_outsize_impact/">How to Give Your Postgres Blog Posts an Outsize Impact</a>: This was out of my comfort zone. It's about how to write and promote your tech blog. I realized that it is a lot of work. Thanks <a target="_blank" href="https://hachyderm.io/@clairegiordano/">Claire</a><br />  <em>Recommend</em> watching</p>
</li>
<li><p><a target="_blank" href="https://fosdem.org/2023/schedule/event/postgresql_when_it_all_goes_right/">When it all GOes right</a>: Talk about PGX driver, a PostgreSQL driver for Golang. I liked <a target="_blank" href="https://github.com/pashagolub">Pavlo's</a> presentation, he has a style.<br />  <em>Recommend</em> watching</p>
</li>
<li><p><a target="_blank" href="https://fosdem.org/2023/schedule/event/postgresql_deep_dive_into_query_performance/">Deep Dive Into Query Performance</a>: Talk from <a target="_blank" href="https://peterzaitsev.com/">Peter Zaitsev</a> the co-founder or Percona.</p>
</li>
<li><p><a target="_blank" href="https://fosdem.org/2023/schedule/event/postgresql_dont_do_this/">Don't Do This</a>: This talk has a clickbait title but in my opinion, was the most interesting talk I attended this year. Thanks <a target="_blank" href="https://vyruss.org/computing/">Jimmy</a></p>
<p>  Highly <em>recommend</em> watching</p>
</li>
</ul>
<p>If you have the time to watch only one talk, I recommend "Don't Do This". If you have the time to watch one more, watch Claire's talk.</p>
<h2 id="heading-networking">Networking</h2>
<p>It was good that I attended the conference with two friends. I hadn't seen one of them in person since 2019, so this was one of the highlights for me. I also ran into an old colleague by chance, someone I keep in touch with now and then. This was great.</p>
<p>Finally, I had the chance to chat with the author of Watermill and to ask PostgreSQL hackers for advice on work-related problems.</p>
<h2 id="heading-summary">Summary</h2>
<p>In conclusion, the FOSDEM conference was a rich and rewarding experience that offered insights into the latest developments in the open-source world. Attending the conference provided the opportunity to connect with a diverse community of like-minded individuals, exchange ideas, and learn from experts in the field. The conference schedule was packed with informative sessions and workshops, and it was challenging to choose which talks to attend due to the variety of interesting topics. Nevertheless, the highlights of the conference provided new perspectives, fresh ideas, and valuable insights that will stay with me for a long time. I highly recommend FOSDEM to anyone interested in the open-source community and technology.</p>
<h2 id="heading-tips">Tips</h2>
<ul>
<li><p><strong>Be Selective</strong>: With so many high-quality sessions and workshops to choose from, it's important to prioritize and select the talks that align with your professional interests and goals.</p>
</li>
<li><p><strong>Be open-minded</strong>: FOSDEM provides a great opportunity to learn from others and exchange ideas, so be open-minded and willing to engage in discussions and debates with others.</p>
</li>
<li><p><strong>Try Belgian Beer</strong>: You should try a few of the Tripel beers from Belgium.</p>
</li>
</ul>
]]></content:encoded></item></channel></rss>