aboutsummaryrefslogtreecommitdiff
;;; GNU Guix --- Functional package management for GNU
;;; Copyright © 2017 Ludovic Courtès <ludo@gnu.org>
;;;
;;; This file is part of GNU Guix.
;;;
;;; GNU Guix is free software; you can redistribute it and/or modify it
;;; under the terms of the GNU General Public License as published by
;;; the Free Software Foundation; either version 3 of the License, or (at
;;; your option) any later version.
;;;
;;; GNU Guix is distributed in the hope that it will be useful, but
;;; WITHOUT ANY WARRANTY; without even the implied warranty of
;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
;;; GNU General Public License for more details.
;;;
;;; You should have received a copy of the GNU General Public License
;;; along with GNU Guix.  If not, see <http://www.gnu.org/licenses/>.

(define-module (test-workers)
  #:use-module (guix workers)
  #:use-module (ice-9 threads)
  #:use-module (srfi srfi-64))

(test-begin "workers")

(test-equal "enqueue"
  4242
  (let* ((pool   (make-pool))
         (result 0)
         (1+!    (let ((lock (make-mutex)))
                   (lambda ()
                     (with-mutex lock
                       (set! result (+ result 1)))))))
    (let loop ((i 4242))
      (unless (zero? i)
        (pool-enqueue! pool 1+!)
        (loop (- i 1))))
    (let poll ()
      (unless (pool-idle? pool)
        (pk 'busy result)
        (sleep 1)
        (poll)))
    result))

;; Same as above, but throw exceptions within the workers and make sure they
;; remain alive.
(test-equal "exceptions"
  4242
  (let* ((pool   (make-pool 10))
         (result 0)
         (1+!    (let ((lock (make-mutex)))
                   (lambda ()
                     (with-mutex lock
                       (set! result (+ result 1)))))))
    (let loop ((i 10))
      (unless (zero? i)
        (pool-enqueue! pool (lambda ()
                              (throw 'whatever)))
        (loop (- i 1))))
    (let loop ((i 4242))
      (unless (zero? i)
        (pool-enqueue! pool 1+!)
        (loop (- i 1))))
    (let poll ()
      (unless (pool-idle? pool)
        (pk 'busy result)
        (sleep 1)
        (poll)))
    result))

(test-end)
arkdown-body h6 { color: #777; font-size: 14px; } .markdown-body p, .markdown-body blockquote, .markdown-body ul, .markdown-body ol, .markdown-body dl, .markdown-body table, .markdown-body pre { margin: 15px 0; } .markdown-body hr { border: 2px solid #ccc; } .markdown-body>h2:first-child, .markdown-body>h1:first-child, .markdown-body>h1:first-child+h2, .markdown-body>h3:first-child, .markdown-body>h4:first-child, .markdown-body>h5:first-child, .markdown-body>h6:first-child { margin-top: 0; padding-top: 0; } .markdown-body a:first-child h1, .markdown-body a:first-child h2, .markdown-body a:first-child h3, .markdown-body a:first-child h4, .markdown-body a:first-child h5, .markdown-body a:first-child h6 { margin-top: 0; padding-top: 0; } .markdown-body h1+p, .markdown-body h2+p, .markdown-body h3+p, .markdown-body h4+p, .markdown-body h5+p, .markdown-body h6+p { margin-top: 0; } .markdown-body li p.first { display: inline-block; } .markdown-body ul, .markdown-body ol { padding-left: 30px; } .markdown-body ul.no-list, .markdown-body ol.no-list { list-style-type: none; padding: 0; } .markdown-body ul li>:first-child, .markdown-body ul li ul:first-of-type, .markdown-body ul li ol:first-of-type, .markdown-body ol li>:first-child, .markdown-body ol li ul:first-of-type, .markdown-body ol li ol:first-of-type { margin-top: 0px; } .markdown-body ul li p:last-of-type, .markdown-body ol li p:last-of-type { margin-bottom: 0; } .markdown-body ul ul, .markdown-body ul ol, .markdown-body ol ol, .markdown-body ol ul { margin-bottom: 0; } .markdown-body dl { padding: 0; } .markdown-body dl dt { font-size: 14px; font-weight: bold; font-style: italic; padding: 0; margin: 15px 0 5px; } .markdown-body dl dt:first-child { padding: 0; } .markdown-body dl dt>:first-child { margin-top: 0px; } .markdown-body dl dt>:last-child { margin-bottom: 0px; } .markdown-body dl dd { margin: 0 0 15px; padding: 0 15px; } .markdown-body dl dd>:first-child { margin-top: 0px; } .markdown-body dl dd>:last-child { margin-bottom: 0px; } .markdown-body blockquote { border-left: 4px solid #DDD; padding: 0 15px; color: #777; } .markdown-body blockquote>:first-child { margin-top: 0px; } .markdown-body blockquote>:last-child { margin-bottom: 0px; } .markdown-body table th { font-weight: bold; } .markdown-body table th, .markdown-body table td { border: 1px solid #ccc; padding: 6px 13px; } .markdown-body table tr { border-top: 1px solid #ccc; background-color: #fff; } .markdown-body table tr:nth-child(2n) { background-color: #f8f8f8; } .markdown-body img { max-width: 100%; -moz-box-sizing: border-box; box-sizing: border-box; } .markdown-body span.frame { display: block; overflow: hidden; } .markdown-body span.frame>span { border: 1px solid #ddd; display: block; float: left; overflow: hidden; margin: 13px 0 0; padding: 7px; width: auto; } .markdown-body span.frame span img { display: block; float: left; } .markdown-body span.frame span span { clear: both; color: #333; display: block; padding: 5px 0 0; } .markdown-body span.align-center { display: block; overflow: hidden; clear: both; } .markdown-body span.align-center>span { display: block; overflow: hidden; margin: 13px auto 0; text-align: center; } .markdown-body span.align-center span img { margin: 0 auto; text-align: center; } .markdown-body span.align-right { display: block; overflow: hidden; clear: both; } .markdown-body span.align-right>span { display: block; overflow: hidden; margin: 13px 0 0; text-align: right; } .markdown-body span.align-right span img { margin: 0; text-align: right; } .markdown-body span.float-left { display: block; margin-right: 13px; overflow: hidden; float: left; } .markdown-body span.float-left span { margin: 13px 0 0; } .markdown-body span.float-right { display: block; margin-left: 13px; overflow: hidden; float: right; } .markdown-body span.float-right>span { display: block; overflow: hidden; margin: 13px auto 0; text-align: right; } .markdown-body code, .markdown-body tt { margin: 0 2px; padding: 0px 5px; border: 1px solid #eaeaea; background-color: #f8f8f8; border-radius: 3px; } .markdown-body code { white-space: nowrap; } .markdown-body pre>code { margin: 0; padding: 0; white-space: pre; border: none; background: transparent; } .markdown-body .highlight pre, .markdown-body pre { background-color: #f8f8f8; border: 1px solid #ccc; font-size: 13px; line-height: 19px; overflow: auto; padding: 6px 10px; border-radius: 3px; } .markdown-body pre code, .markdown-body pre tt { margin: 0; padding: 0; background-color: transparent; border: none; } pre { line-height: 125%; } td.linenos .normal { color: inherit; background-color: transparent; padding-left: 5px; padding-right: 5px; } span.linenos { color: inherit; background-color: transparent; padding-left: 5px; padding-right: 5px; } td.linenos .special { color: #000000; background-color: #ffffc0; padding-left: 5px; padding-right: 5px; } span.linenos.special { color: #000000; background-color: #ffffc0; padding-left: 5px; padding-right: 5px; } .highlight .hll { background-color: #ffffcc } .highlight { background: #ffffff; } .highlight .c { color: #888888 } /* Comment */ .highlight .err { color: #a61717; background-color: #e3d2d2 } /* Error */ .highlight .k { color: #008800; font-weight: bold } /* Keyword */ .highlight .ch { color: #888888 } /* Comment.Hashbang */ .highlight .cm { color: #888888 } /* Comment.Multiline */ .highlight .cp { color: #cc0000; font-weight: bold } /* Comment.Preproc */ .highlight .cpf { color: #888888 } /* Comment.PreprocFile */ .highlight .c1 { color: #888888 } /* Comment.Single */ .highlight .cs { color: #cc0000; font-weight: bold; background-color: #fff0f0 } /* Comment.Special */ .highlight .gd { color: #000000; background-color: #ffdddd } /* Generic.Deleted */ .highlight .ge { font-style: italic } /* Generic.Emph */ .highlight .gr { color: #aa0000 } /* Generic.Error */ .highlight .gh { color: #333333 } /* Generic.Heading */ .highlight .gi { color: #000000; background-color: #ddffdd } /* Generic.Inserted */ .highlight .go { color: #888888 } /* Generic.Output */ .highlight .gp { color: #555555 } /* Generic.Prompt */ .highlight .gs { font-weight: bold } /* Generic.Strong */ .highlight .gu { color: #666666 } /* Generic.Subheading */ .highlight .gt { color: #aa0000 } /* Generic.Traceback */ .highlight .kc { color: #008800; font-weight: bold } /* Keyword.Constant */ .highlight .kd { color: #008800; font-weight: bold } /* Keyword.Declaration */ .highlight .kn { color: #008800; font-weight: bold } /* Keyword.Namespace */ .highlight .kp { color: #008800 } /* Keyword.Pseudo */ .highlight .kr { color: #008800; font-weight: bold } /* Keyword.Reserved */ .highlight .kt { color: #888888; font-weight: bold } /* Keyword.Type */ .highlight .m { color: #0000DD; font-weight: bold } /* Literal.Number */ .highlight .s { color: #dd2200; background-color: #fff0f0 } /* Literal.String */ .highlight .na { color: #336699 } /* Name.Attribute */ .highlight .nb { color: #003388 } /* Name.Builtin */ .highlight .nc { color: #bb0066; font-weight: bold } /* Name.Class */ .highlight .no { color: #003366; font-weight: bold } /* Name.Constant */ .highlight .nd { color: #555555 } /* Name.Decorator */ .highlight .ne { color: #bb0066; font-weight: bold } /* Name.Exception */ .highlight .nf { color: #0066bb; font-weight: bold } /* Name.Function */ .highlight .nl { color: #336699; font-style: italic } /* Name.Label */ .highlight .nn { color: #bb0066; font-weight: bold } /* Name.Namespace */ .highlight .py { color: #336699; font-weight: bold } /* Name.Property */ .highlight .nt { color: #bb0066; font-weight: bold } /* Name.Tag */ .highlight .nv { color: #336699 } /* Name.Variable */ .highlight .ow { color: #008800 } /* Operator.Word */ .highlight .w { color: #bbbbbb } /* Text.Whitespace */ .highlight .mb { color: #0000DD; font-weight: bold } /* Literal.Number.Bin */ .highlight .mf { color: #0000DD; font-weight: bold } /* Literal.Number.Float */ .highlight .mh { color: #0000DD; font-weight: bold } /* Literal.Number.Hex */ .highlight .mi { color: #0000DD; font-weight: bold } /* Literal.Number.Integer */ .highlight .mo { color: #0000DD; font-weight: bold } /* Literal.Number.Oct */ .highlight .sa { color: #dd2200; background-color: #fff0f0 } /* Literal.String.Affix */ .highlight .sb { color: #dd2200; background-color: #fff0f0 } /* Literal.String.Backtick */ .highlight .sc { color: #dd2200; background-color: #fff0f0 } /* Literal.String.Char */ .highlight .dl { color: #dd2200; background-color: #fff0f0 } /* Literal.String.Delimiter */ .highlight .sd { color: #dd2200; background-color: #fff0f0 } /* Literal.String.Doc */ .highlight .s2 { color: #dd2200; background-color: #fff0f0 } /* Literal.String.Double */ .highlight .se { color: #0044dd; background-color: #fff0f0 } /* Literal.String.Escape */ .highlight .sh { color: #dd2200; background-color: #fff0f0 } /* Literal.String.Heredoc */ .highlight .si { color: #3333bb; background-color: #fff0f0 } /* Literal.String.Interpol */ .highlight .sx { color: #22bb22; background-color: #f0fff0 } /* Literal.String.Other */ .highlight .sr { color: #008800; background-color: #fff0ff } /* Literal.String.Regex */ .highlight .s1 { color: #dd2200; background-color: #fff0f0 } /* Literal.String.Single */ .highlight .ss { color: #aa6600; background-color: #fff0f0 } /* Literal.String.Symbol */ .highlight .bp { color: #003388 } /* Name.Builtin.Pseudo */ .highlight .fm { color: #0066bb; font-weight: bold } /* Name.Function.Magic */ .highlight .vc { color: #336699 } /* Name.Variable.Class */ .highlight .vg { color: #dd7700 } /* Name.Variable.Global */ .highlight .vi { color: #3333bb } /* Name.Variable.Instance */ .highlight .vm { color: #336699 } /* Name.Variable.Magic */ .highlight .il { color: #0000DD; font-weight: bold } /* Literal.Number.Integer.Long */

Task solution - matrix multiplication in fortran

This repository contains the realization of the first task of the fortran programming language course at AGH in Cracov.

Directory structure

CMakeLists.txt
README.md
make_results.sh
template.gnuplot
src/
    blockmath.F90
    testing.F90
    naivemath.F90
    bettermath.F90
    main.F90
    dotmath.F90
    bettermath2.F90
res/
    dot_16
    dot_8
    mat_8
    bett2_4
    bett_4
    bett2_8
    block_4
    naiv_8
    mat_16
    bett_8
    naiv_4
    block_8
    naiv_16
    mat_4
    dot_4
    bett_16
    block_16
    bett2_16
    wykres4.pdf
    wykres8.pdf
    wykres16.pdf
    wykres4.svg
    wykres8.svg
    wykres16.svg

CmakeLists.txt

This file contains CMake definitions for building fortran sources. It does not, however, define the actions of running the program, generating results, nor plotting.

make_results.sh

This script runs built program under different options and writes measured multiplication times in res/{naiv,bett,dot,mat,bett2,block}_{4,8,16} while also running gnuplot to generate their corresponding plots into res/wykres{4,8,16}.pdf.

template.gnuplot

It's the gnuplot script for generating plots from text files in res. In this script [kind] has to be replaced with actual number (for example with the help of sed, see make_results.sh for how it's done).

src/

This directory contains the fortran sources of the project, namely modules sources *math.F90 and main program source main.F90.

res/

This directory contains text results of matrix multiplication time measuring (files without extension) and their corresponding plots (pdf files as required for the task and respective svg versions for embedding here).

Building

As a prerequisite, CMake, make and a fortran compiler, either gfortran or ifort, are required (didn't want to support proprietary compiler :c but had to).

make_results.sh assumes an out-of-source build in build/ has been performed.

$ mkdir build
$ cd build
$ cmake ../
$ make mull

The CMake configuration provided allows one to specify one of twosets of compiler flags. The default is RELEASE and it can be changed to DEBUG using

$ cmake .. -DCMAKE_BUILD_TYPE=DEBUG

Running

The executable is mull.

 $ ./mull
 Usage: mull KIND IMPLEMENTATION
 where KIND is one of: 4, 8, 16
  IMPLEMENTATION is one of: naiv, bett, dot, mat, bett2, block

When ran with proper arguments, it prints matrix sizes and multiplication times in columns.

Text files and plots in res/ can be regenerated by issuing

$ ./make_results.sh

from the main project directory (may take up to several tens of minutes).

Implementation

5 versions of matrix multiplication are implemented in separate files under src/. They're all called and their running times measured in src/main.F90 together with fortran's built-in matmul() intrinsic function.

naive

Implemented in src/naivemath.F90 the order of loops is i, j, k.

better

Implemented in src/bettermath.F90 the order of loops is j, k, i. It's expected to be faster than naive, because the innermost loop accesses array elements that are closed to each other in memory.

dot

Implemented in src/dotmath.F90 the order of loops is i, j, k. The innermost loop is replaced with dot_product() instrinsic function call, that is supposed to aid it's optimization.

better2

Implemented in src/bettermath2.F90 the order of loops is j, k, i. The innermost loop is replaced with an operation on an entire column.

block

Implemented in src/blockmath.F90. The multiplication of matrices is achieved by many multiplications of submatrices (blocks). The order of loops for one block is j, k, i with the innermost loop replaced with an operation on entire blocks' columns.

Results

KIND=4

plot kind=4

KIND=8

plot kind=8

KIND=16

plot kind=16

Conclusions

As expected, modification of algorithm to reference memory more locally improves the execution speed (difference between naive and better). This is achieved through better processor cache exploitation.

Usage of intrinsic dot_product() as well as fortran's array operations didn't really help the compiler further optimize code (almost no diference between better and better2, between naive and dot). However, the same change in a bigger fashion, mainly - application of matmul(), gave significant performance gains.

Block array multiplication is supposed to increase temporal locality of operations, thus allowing efficient use use L1 cache. Whether this method is really beneficial depends on factors like processor model. The performance gain, if it there is any, is greater for bigger matices. In this experiment this algorithm was almost aqually effective to better and better2 implementations.

The differences between algorithms' speeds are biggest for single precision numbers and almost unnoticable in case of kind 16 precision. One possible explaination of this is that operations on high precision numbers consume more processor cycles, hence 1. memory accesses have smaller impart on the overall running time of the code 2. they can be performed by processor in parallel with arithmetic operations and have a chance of completing before the FPU can start another operation