SharpCompress 0.18

NuGet
GitHub Release

New
* Breaking change – Remove the ArchiveEncoding static class in favor of an instance on Options. ArchiveEncoding is now on the Options base class. This allows more Encoding class options as well as a custom Func for decoding. Being instance based avoids multi-threading issues. See https://github.com/adamhathcock/sharpcompress/blob/master/src/SharpCompress/Common/ArchiveEncoding.cs

Fixes
* LeaveStreamOpen doesn’t work with TarWriter
* If Zip file has normal file header AND a post-descriptor header AND the file is attempted to be skipped by a ZipReader, then the data is attempted to be skipped twice.
* AbstractReader.Skip() does not fully read bytes from non-local streams

.NET Core on Circle CI 2.0 using Docker and Cake

I’ve only just started with Circle 2.0, which just had its beta tag removed.

It’s completely Docker based which I adore. I refuse to package code any other way these days.

My goal was to build on what I previously did on Circle CI but only use an official Microsoft .NET Core SDK Docker image. Having to layer extra tools onto another image and manage that is extra work. I abhor extra work.

.circleci/config.yml

Circle 2.0 moves their YAML to a subdirectory, which seems to be en vogue these days, so we can have lots of files for specific services!

version: 2
jobs:
  build:
    working_directory: ~/api
    docker:
      - image: microsoft/dotnet:1.1.2-sdk-jessie
    environment:
      - DOTNET_CLI_TELEMETRY_OPTOUT: 1
      - CAKE_VERSION: 0.19.1
    steps:
      - checkout
      - restore_cache:
          keys:
            - cake-{{ .Environment.CAKE_VERSION }}
      - run: ./build.sh build.cake --target=restore
      - save_cache:
          key: cake-{{ .Environment.CAKE_VERSION }}
          paths:
            - ~/api/tools
      - run: ./build.sh build.cake --target=build
      - run: ./build.sh build.cake --target=test

The hard part with Circle CI 2.0 is that caching is done pretty manually and changes aren’t auto-detected. You have to version cache keys or hashes that act as cache keys. I haven’t mastered it yet.

Ideally, I’d cache the Cake tools directory and my .nuget folder on this running image but I’m not there yet.

The big thing to note is that the image is based on the official SDK image with all the necessary build tools.

Bootstrapping Cake

So it should be easy to do this now as I already have a build.sh to execute Cake right? Nope!

The stock bash script uses the unzip utility, which usually exists but is missing from this image. It is needed to extract the NuGet package that is downloaded. curl doesn’t exist either, by the way.

Fortunately, the dotnet cli is here. It should easily restore Cake. My new build.sh needs a csproj to restore Cake with. Since the new csproj XML is tiny, this is easy to echo into a file.

#!/usr/bin/env bash
##########################################################################
# This is the Cake bootstrapper script for Linux and OS X.
# This file was downloaded from https://github.com/cake-build/resources
# Feel free to change this file to fit your needs.
##########################################################################

# Define directories.
SCRIPT_DIR=$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )
TOOLS_DIR=$SCRIPT_DIR/tools
TOOLS_PROJ=$TOOLS_DIR/tools.csproj
CAKE_DLL=$TOOLS_DIR/Cake.CoreCLR.$CAKE_VERSION/cake.coreclr/$CAKE_VERSION/Cake.dll


# Make sure the tools folder exists.
if [ ! -d "$TOOLS_DIR" ]; then
  mkdir "$TOOLS_DIR"
fi

###########################################################################
# INSTALL CAKE
###########################################################################

if [ ! -f "$CAKE_DLL" ]; then
    # Write a minimal project file so that the dotnet CLI has something to restore Cake into.
    echo "<Project Sdk=\"Microsoft.NET.Sdk\"><PropertyGroup><OutputType>Exe</OutputType><TargetFramework>netcoreapp1.1</TargetFramework></PropertyGroup></Project>" > "$TOOLS_PROJ"
    dotnet add "$TOOLS_PROJ" package cake.coreclr -v "$CAKE_VERSION" --package-directory "$TOOLS_DIR/Cake.CoreCLR.$CAKE_VERSION"
fi

# Make sure that Cake has been installed.
if [ ! -f "$CAKE_DLL" ]; then
    echo "Could not find Cake.dll at '$CAKE_DLL'."
    exit 1
fi

###########################################################################
# RUN BUILD SCRIPT
###########################################################################

# Start Cake
exec dotnet "$CAKE_DLL" "$@"

Note: I’ve moved the CAKE_VERSION variable out of the script to attempt to use it with Circle CI, but it can easily be added back.
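As a rough sketch of adding it back: the script could keep the value from the environment when Circle CI provides one and otherwise fall back to a pinned default. The helper name resolve_cake_version is made up for this illustration, and the default of 0.19.1 is simply the version from the Circle config above.

```shell
# Hypothetical helper: prefer CAKE_VERSION from the environment (e.g. Circle CI),
# otherwise fall back to a pinned default. 0.19.1 matches the config above.
resolve_cake_version() {
  printf '%s\n' "${CAKE_VERSION:-0.19.1}"
}

CAKE_VERSION=$(resolve_cake_version)
echo "Using Cake $CAKE_VERSION"
```

This would sit near the top of build.sh, before CAKE_DLL is computed. A plain CAKE_VERSION=${CAKE_VERSION:-0.19.1} line does the same job without the function.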

SharpCompress 0.17 (LZip, XZ) and SharpCompress in .NET Core?

SharpCompress 0.17 released on NuGet!

New Features – Full LZip support!

To me this was a big missing hole when considering how to compress. LZMA seems to be the best in the business at the moment and lots of people want to use it. Jon Skeet provided LZip read support a while back but I only now fleshed it out for writing. Even works with tar.lz just like tar.gz.

New Features – XZ read support!

XZ is another LZMA (well, LZMA2) archive format. Not sure where it came from but the LZip author isn’t impressed and I have to say, I’m not either: Xz format inadequate for long-term archiving

The XZ support is basic and it only supports one internal “stream” (e.g. file) even though multiple are possible.

There are a couple more fixes to read about on the GitHub page.

Recommendations

After poking around different file formats and compressors for a long time now, I decided to write up what I think people ought to use when considering archive formats and algorithms.

Recommended: Tar with GZip/BZip2/LZip

In general, I recommend GZip (Deflate)/BZip2 (BZip)/LZip (LZMA), as the simplicity of these formats lends itself to better long-term archival as well as streamability. Tar is often used in conjunction with them to put multiple files in a single archive (e.g. .tar.gz).

Tar is aging a bit with a lot of extensions but it’s still simple. I believe there are Tar replacements but I don’t come across them often.
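To make the streamability point concrete, here is a small sketch using the standard tar and gzip command line tools rather than SharpCompress. The function names and the sample files are invented for the illustration. Both directions are plain pipelines, so neither tool ever has to seek.

```shell
# Create a .tar.gz as a pipeline: tar writes the archive to stdout,
# gzip compresses that single stream.
# usage: make_tar_gz <parent-dir> <entry-name> <output-file>
make_tar_gz() {
  tar -cf - -C "$1" "$2" | gzip > "$3"
}

# List the entries by streaming in the other direction:
# gzip decompresses to stdout and tar reads the archive from stdin.
list_tar_gz() {
  gzip -dc "$1" | tar -tf -
}

# Small demonstration with throwaway files.
work=$(mktemp -d)
mkdir "$work/docs"
printf 'hello\n' > "$work/docs/a.txt"
make_tar_gz "$work" docs "$work/docs.tar.gz"
list_tar_gz "$work/docs.tar.gz"
rm -rf "$work"
```

Assuming the lzip command line tool is installed, swapping gzip for lzip in both functions should give the tar.lz equivalent.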

Not recommended: Zip

Zip is okay, but it’s a very haphazard format, and the variation in headers and implementations makes it hard to get correct. It uses Deflate by default but supports a lot of compression methods.

Zip has been king for a while, so it’s not like this format is going away anytime soon.

Avoid: RAR

RAR is not recommended as it’s a proprietary format and the compression is closed source. Use Tar/LZip for LZMA. I’m not up to date on how RAR compares with other compressors, but its claim to fame was that it was better than Zip/DEFLATE. It’s probably not better than LZMA.

Avoid: 7Zip and XZ

7Zip and XZ both are overly complicated. 7Zip does not support streamable formats. XZ has known holes explained here: Xz format inadequate for long-term archiving

Use Tar/LZip for LZMA compression instead.

.NET Core and compression

Recently, I’ve been following a discussion on the CoreFx repo. Mainly, I’ve wanted to push compressors into the core for native support but probably keep archive formats outside. Most implementations don’t support forward-only streams.

I’ve now opened a new issue about forward-only. I may push SharpCompress all the way in the core! Who knows?

The code probably needs a bit of a rewrite as quality is all over the place. I’ve never been one to redo algorithms but I’ve always been proud of the unified interface.

.NET Core 1.1 building with Docker and Cake (part 2)

This is a follow-up to my original post: .NET Core 1.1 building with Docker and Cake. That post has a bit more detail than this one.

Essentially, over time, the build became a bit too slow. Pulling a build container (with Mono) on Circle CI for every build took too long.

I’ve moved away from a build container but still publish and create a Docker image.

Build Process Overview

  1. Dependencies:
    • Install dotnet SDK
    • dotnet restore via Cake
  2. Compile:
    • dotnet build via Cake
  3. Test:
    • dotnet test via Cake
  4. Deployment:
    • dotnet publish
    • Use Dockerfile to create image
    • Push built image to AWS ECR

Circle CI configuration

machine:
  environment:
    DOTNET_CLI_TELEMETRY_OPTOUT: 1
  services:
    - docker

dependencies:
  pre:
    - sudo sh -c 'echo "deb [arch=amd64] https://apt-mo.trafficmanager.net/repos/dotnet-release/ trusty main" > /etc/apt/sources.list.d/dotnetdev.list'
    - sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys 417A0893
    - sudo apt-get update
    - sudo apt-get install dotnet-dev-1.0.4
  override:
    - ./build.sh build.cake --target=restore
  cache_directories:
    - ~/.nuget

compile:
  override:
    - ./build.sh build.cake --target=build

test:
  override:
    - ./build.sh build.cake --target=test

deployment:
  builds:
    branch: [master, dev]
    commands:
      - mkdir publish/
      - dotnet publish src/Api.Server -f netcoreapp1.1 -c Release -o ../../publish --version-suffix ${CIRCLE_BRANCH}-${CIRCLE_BUILD_NUM}
      - docker build -f Dockerfile -t server-api:latest .
      - docker tag server-api:latest $AWS_ACCOUNT_ID.dkr.ecr.eu-west-1.amazonaws.com/server-api:$CIRCLE_BUILD_NUM-$CIRCLE_BRANCH
      - ./push.sh

New: Build Phases

I still hang everything together with a Cake script but call the stages individually to better match the stages on Circle CI. It seems most build services work this way.

New: dotnet SDK installation

This is just a copy/paste from the https://dot.net site for Ubuntu. The current version of the SDK now is 1.0.4.

New: Caching NuGet dependencies

Circle CI and other services have a notion of caching. This was easy on Circle. I just tell it to save my .nuget directory and NuGet pulls are much faster. I should figure out something better for the SDK itself, but that probably means a base Docker image. Maybe this is better for Circle CI 2.0, which is all Docker based.

New: Branch tagging

Circle supports having the build number as well as the branch as an environment variable. Using this to tag is nicer for me as well.
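One thing to watch when tagging with the branch name: Docker tags may only contain letters, digits, underscores, periods and dashes, so a branch such as feature/login would not be accepted as is. With only master and dev being deployed here it doesn’t matter, but a sketch of a safer tag builder could look like the following. The function name make_image_tag and the fallback values are invented for this example.

```shell
# Hypothetical helper: build "<build number>-<branch>" and replace every
# character that is not allowed in a Docker tag with a hyphen.
# $1 = build number, $2 = branch name
make_image_tag() {
  printf '%s-%s\n' "$1" "$(printf '%s' "$2" | tr -c 'A-Za-z0-9_.-' '-')"
}

# On Circle CI both variables are set by the service; the fallbacks only
# exist so that the snippet can also run locally.
make_image_tag "${CIRCLE_BUILD_NUM:-0}" "${CIRCLE_BRANCH:-local}"
```

The result could then be used for both the docker tag command and the git tag in push.sh so the two always agree.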

Cake file

The cake file has changed since the last post too. Cake better supports the dotnet commands. I still have to manually glob for tests to run though.

var target = Argument("target", "Default");
var tag = Argument("tag", "cake");

Task("Restore")
  .Does(() =>
{
    DotNetCoreRestore(".");
});

Task("Build")
  .Does(() =>
{
    DotNetCoreBuild(".");
});

Task("Test")
  .Does(() =>
{
    var files = GetFiles("test/**/*.csproj");
    foreach(var file in files)
    {
        DotNetCoreTest(file.ToString());
    }
});

Task("Publish")
  .Does(() =>
{
    var settings = new DotNetCorePublishSettings
    {
        Framework = "netcoreapp1.1",
        Configuration = "Release",
        OutputDirectory = "../../publish",
        VersionSuffix = tag
    };

    DotNetCorePublish("src/Api.Server", settings);
});

Task("Default")
    .IsDependentOn("Restore")
    .IsDependentOn("Build")
    .IsDependentOn("Test");

RunTarget(target);

The deployment Dockerfile

FROM microsoft/dotnet:1.1.2-runtime

COPY ./publish/api /app
WORKDIR /app

EXPOSE 5000

ENTRYPOINT ["dotnet", "Api.Server.dll"]

I no longer hardcode the ASPNETCORE_ENVIRONMENT variable in the Dockerfile; I set it in my ECS config using Terraform instead. That’s another subject though.

Publishing to AWS ECR – push.sh

I could probably fold this into the circle.yml, but I like having it separate.

I added a git push for tagging to my GitHub repo.

#!/usr/bin/env bash

# more bash-friendly output for jq
JQ="jq --raw-output --exit-status"

configure_aws_cli(){
    aws --version
    aws configure set default.region eu-west-1
    aws configure set default.output json
}

push_ecr_image(){
    eval $(aws ecr get-login --region eu-west-1)
    docker push $AWS_ACCOUNT_ID.dkr.ecr.eu-west-1.amazonaws.com/server-api:$CIRCLE_BUILD_NUM-$CIRCLE_BRANCH
}

configure_aws_cli
push_ecr_image

git tag -a $CIRCLE_BUILD_NUM-$CIRCLE_BRANCH -m "Circle CI Build Tag"
git push origin --tags

SharpCompress 0.16.0 Released

Another release with some good changes. I’m still deciding on where to take this. I’m still leaving SharpCompress without a 1.0 release as I never feel confident enough to be strict with myself not to break the API in case of changes.

I have started a dotnet tool branch for fun, as well as to consume my own API again and get a better sense of how things feel. Take a look at the branch: dotnet tool

SharpCompress 0.16.0 on NuGet

Changelog

As always, more fixes and help are welcome!

For me to remember: .NET Core and JWT

This is a Memory Store ™ post for me to remember later. This isn’t an intro to JWT or JWT with .NET Core. Here are some better links for that:

Check the official docs for more about JWT Bearer auth or ASP.NET Core identity in general.

This post is more “this is how I did it because it still felt unclear after reading things.”

I did this while creating the Realworld Sample for ASP.NET Core

Essentially, the JWT Bearer library that is provided for ASP.NET Core handles all of the checking of a JWT token if it’s on the Authorization header as Bearer. Which is great. Just need to hook it up:

(JwtIssuerOptions is covered later)

public static void AddJwt(this IServiceCollection services)
{
    //using options with JwtIssuerOptions
    services.AddOptions();

    //store this key somewhere else!
    var signingKey = new SymmetricSecurityKey(Encoding.ASCII.GetBytes("somethinglongerforthisdumbalgorithmisrequired"));
    services.Configure<JwtIssuerOptions>(options =>
    {
        //change this value!
        options.Issuer = "issuer";
        //change this value!
        options.Audience = "Audience";
        options.SigningCredentials = new SigningCredentials(signingKey, SecurityAlgorithms.HmacSha256);
    });
}

public static void UseJwt(this IApplicationBuilder app)
{
    var options = app.ApplicationServices.GetRequiredService<IOptions<JwtIssuerOptions>>();

    var tokenValidationParameters = new TokenValidationParameters
    {
        // The signing key must match!
        ValidateIssuerSigningKey = true,
        IssuerSigningKey = options.Value.SigningCredentials.Key,
        // Validate the JWT Issuer (iss) claim
        ValidateIssuer = true,
        ValidIssuer = options.Value.Issuer,
        // Validate the JWT Audience (aud) claim
        ValidateAudience = true,
        ValidAudience = options.Value.Audience,
        // Validate the token expiry
        ValidateLifetime = true,
        // If you want to allow a certain amount of clock drift, set that here:
        ClockSkew = TimeSpan.Zero
    };

    app.UseJwtBearerAuthentication(new JwtBearerOptions
    {
        AutomaticAuthenticate = true,
        AutomaticChallenge = true,
        TokenValidationParameters = tokenValidationParameters,
        AuthenticationScheme = JwtIssuerOptions.Scheme
    });
}

This hooks JWT into your Startup. Easy peasy. What wasn’t easy peasy was understanding how a person logs in with JWT and how claims are managed for ASP.NET Core Identity.

The options used for ASP.NET Core JWT also need to be used for generating the JWT tokens. This was lifted from one of the above links and it works well.

public class JwtIssuerOptions
{
    public const string Scheme = "Token";

    /// <summary>
    /// "iss" (Issuer) Claim
    /// </summary>
    /// <remarks>The "iss" (issuer) claim identifies the principal that issued the
    ///   JWT.  The processing of this claim is generally application specific.
    ///   The "iss" value is a case-sensitive string containing a StringOrURI
    ///   value.  Use of this claim is OPTIONAL.</remarks>
    public string Issuer { get; set; }

    /// <summary>
    /// "sub" (Subject) Claim
    /// </summary>
    /// <remarks> The "sub" (subject) claim identifies the principal that is the
    ///   subject of the JWT.  The claims in a JWT are normally statements
    ///   about the subject.  The subject value MUST either be scoped to be
    ///   locally unique in the context of the issuer or be globally unique.
    ///   The processing of this claim is generally application specific.  The
    ///   "sub" value is a case-sensitive string containing a StringOrURI
    ///   value.  Use of this claim is OPTIONAL.</remarks>
    public string Subject { get; set; }

    /// <summary>
    /// "aud" (Audience) Claim
    /// </summary>
    /// <remarks>The "aud" (audience) claim identifies the recipients that the JWT is
    ///   intended for.  Each principal intended to process the JWT MUST
    ///   identify itself with a value in the audience claim.  If the principal
    ///   processing the claim does not identify itself with a value in the
    ///   "aud" claim when this claim is present, then the JWT MUST be
    ///   rejected.  In the general case, the "aud" value is an array of case-
    ///   sensitive strings, each containing a StringOrURI value.  In the
    ///   special case when the JWT has one audience, the "aud" value MAY be a
    ///   single case-sensitive string containing a StringOrURI value.  The
    ///   interpretation of audience values is generally application specific.
    ///   Use of this claim is OPTIONAL.</remarks>
    public string Audience { get; set; }

    /// <summary>
    /// "nbf" (Not Before) Claim (default is UTC NOW)
    /// </summary>
    /// <remarks>The "nbf" (not before) claim identifies the time before which the JWT
    ///   MUST NOT be accepted for processing.  The processing of the "nbf"
    ///   claim requires that the current date/time MUST be after or equal to
    ///   the not-before date/time listed in the "nbf" claim.  Implementers MAY
    ///   provide for some small leeway, usually no more than a few minutes, to
    ///   account for clock skew.  Its value MUST be a number containing a
    ///   NumericDate value.  Use of this claim is OPTIONAL.</remarks>
    public DateTime NotBefore => DateTime.UtcNow;

    /// <summary>
    /// "iat" (Issued At) Claim (default is UTC NOW)
    /// </summary>
    /// <remarks>The "iat" (issued at) claim identifies the time at which the JWT was
    ///   issued.  This claim can be used to determine the age of the JWT.  Its
    ///   value MUST be a number containing a NumericDate value.  Use of this
    ///   claim is OPTIONAL.</remarks>
    public DateTime IssuedAt => DateTime.UtcNow;

    /// <summary>
    /// Set the timespan the token will be valid for (default is 5 min/300 seconds)
    /// </summary>
    public TimeSpan ValidFor { get; set; } = TimeSpan.FromMinutes(5);

    /// <summary>
    /// "exp" (Expiration Time) Claim (returns IssuedAt + ValidFor)
    /// </summary>
    /// <remarks>The "exp" (expiration time) claim identifies the expiration time on
    ///   or after which the JWT MUST NOT be accepted for processing.  The
    ///   processing of the "exp" claim requires that the current date/time
    ///   MUST be before the expiration date/time listed in the "exp" claim.
    ///   Implementers MAY provide for some small leeway, usually no more than
    ///   a few minutes, to account for clock skew.  Its value MUST be a number
    ///   containing a NumericDate value.  Use of this claim is OPTIONAL.</remarks>
    public DateTime Expiration => IssuedAt.Add(ValidFor);

    /// <summary>
    /// "jti" (JWT ID) Claim (default ID is a GUID)
    /// </summary>
    /// <remarks>The "jti" (JWT ID) claim provides a unique identifier for the JWT.
    ///   The identifier value MUST be assigned in a manner that ensures that
    ///   there is a negligible probability that the same value will be
    ///   accidentally assigned to a different data object; if the application
    ///   uses multiple issuers, collisions MUST be prevented among values
    ///   produced by different issuers as well.  The "jti" claim can be used
    ///   to prevent the JWT from being replayed.  The "jti" value is a case-
    ///   sensitive string.  Use of this claim is OPTIONAL.</remarks>
    public Func<Task<string>> JtiGenerator =>() => Task.FromResult(Guid.NewGuid().ToString());

    /// <summary>
    /// The signing key to use when generating tokens.
    /// </summary>
    public SigningCredentials SigningCredentials { get; set; }
}

Now you use the above options to generate tokens! When do you generate tokens? On login! Use JwtTokenGenerator to create tokens for users.

JWT Bearer Authentication leaves managing the token up to the caller (as opposed to using browser cookies), so the token can just be put in the response of the login.

public class JwtTokenGenerator : IJwtTokenGenerator
{
    private readonly JwtIssuerOptions _jwtOptions;

    public JwtTokenGenerator(IOptions<JwtIssuerOptions> jwtOptions)
    {
        _jwtOptions = jwtOptions.Value;
    }

    //TODO: use something not custom
    private static long ToUnixEpochDate(DateTime date) => (long) Math.Round((date.ToUniversalTime() - new DateTimeOffset(1970, 1, 1, 0, 0, 0, TimeSpan.Zero)).TotalSeconds);

    public async Task<string> CreateToken(string username)
    {
        var claims = new[]
        {
            new Claim(JwtRegisteredClaimNames.Sub, username),
            new Claim(JwtRegisteredClaimNames.Jti, await _jwtOptions.JtiGenerator()),
            new Claim(JwtRegisteredClaimNames.Iat,
                ToUnixEpochDate(_jwtOptions.IssuedAt).ToString(),
                ClaimValueTypes.Integer64)
        };
        var jwt = new JwtSecurityToken(
            issuer: _jwtOptions.Issuer,
            audience: _jwtOptions.Audience,
            claims: claims,
            notBefore: _jwtOptions.NotBefore,
            expires: _jwtOptions.Expiration,
            signingCredentials: _jwtOptions.SigningCredentials);

        var encodedJwt = new JwtSecurityTokenHandler().WriteToken(jwt);
        return encodedJwt;
    }
}
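When debugging, it can help to look at what CreateToken actually put into the token. A JWT is three base64url segments separated by dots, and the middle one holds the claims as JSON. The following is only an inspection sketch: the function names are invented, base64 -d assumes a GNU or BusyBox style base64 tool, and nothing here verifies the signature. The sample token carries a placeholder instead of a real signature.

```shell
# Decode one base64url segment. JWTs use '-' and '_' instead of '+' and '/'
# and drop the trailing '=' padding, so both are restored first.
b64url_decode() {
  seg=$(printf '%s' "$1" | tr '_-' '/+')
  case $(( ${#seg} % 4 )) in
    2) seg="$seg==" ;;
    3) seg="$seg=" ;;
  esac
  printf '%s' "$seg" | base64 -d
}

# Print the claims, i.e. the second dot-separated segment of the token.
jwt_payload() {
  b64url_decode "$(printf '%s' "$1" | cut -d. -f2)"
}

# Sample token whose payload only contains a "sub" claim.
jwt_payload "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiJqYWtlIn0.not-a-real-signature"
echo
```

For a token produced by the generator above, the output should include the sub, jti and iat claims along with the issuer, audience and lifetime values.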

It seems that putting the username in the JwtRegisteredClaimNames.Sub claim exposes it as the ClaimTypes.NameIdentifier claim. So the code below will return the authenticated user name after the request has been validated:

public string GetCurrentUsername()
{
    return _httpContextAccessor.HttpContext.User?.Claims?.FirstOrDefault(x => x.Type == ClaimTypes.NameIdentifier)?.Value;
}

So now we can validate JWT tokens and get the user name of the token once validated. Other claims can be added to the token as well and accessed later. Let the fun begin!

Just annotate your controllers or controller methods as you would with standard ASP.NET Core Identity and it works:

[Authorize(ActiveAuthenticationSchemes = JwtIssuerOptions.Scheme)]

Generating URL slugs in .NET Core

Updated: 5/5/17

  • Better handling of diacritics in sample

I’ve just discovered what a Slug is:

Some systems define a slug as the part of a URL that identifies a page in human-readable keywords.

It is usually the end part of the URL, which can be interpreted as the name of the resource, similar to the basename in a filename or the title of a page. The name is based on the use of the word slug in the news media to indicate a short name given to an article for internal use.

I needed to know this as I’m participating in the Realworld example projects and I’m doing a back end for ASP.NET Core.

The API spec kept saying slug, and I had a moment of “ohhh, that’s what that is.” Anyway, I needed to be able to generate one. Stack Overflow to the rescue: https://stackoverflow.com/questions/2920744/url-slugify-algorithm-in-c

Also, decoding random characters from a lot of languages isn’t straightforward, so I used one of the best-effort implementations from the linked SO page: https://stackoverflow.com/questions/249087/how-do-i-remove-diacritics-accents-from-a-string-in-net

Now, here’s my Slug generator:

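//https://stackoverflow.com/questions/2920744/url-slugify-algorithm-in-c
//https://stackoverflow.com/questions/249087/how-do-i-remove-diacritics-accents-from-a-string-in-net
using System.Globalization;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

public static class Slug
{
    public static string GenerateSlug(this string phrase)
    {
        // lowercase with the invariant culture so the result does not depend on the server locale
        string str = phrase.RemoveDiacritics().ToLowerInvariant();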
        // invalid chars           
        str = Regex.Replace(str, @"[^a-z0-9\s-]", "");
        // convert multiple spaces into one space   
        str = Regex.Replace(str, @"\s+", " ").Trim();
        // cut and trim 
        str = str.Substring(0, str.Length <= 45 ? str.Length : 45).Trim();
        str = Regex.Replace(str, @"\s", "-"); // hyphens   
        return str;
    }

    public static string RemoveDiacritics(this string text)
    {
        var s = new string(text.Normalize(NormalizationForm.FormD)
            .Where(c => CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
            .ToArray());

        return s.Normalize(NormalizationForm.FormC);
    }
}
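To show what the steps do to a sample title, here is a rough shell mirror of GenerateSlug. It is only an approximation for illustration: it assumes ASCII input, handles plain spaces rather than all whitespace, and skips the diacritics step entirely. The function name slugify is made up.

```shell
# Approximate, ASCII-only mirror of the GenerateSlug steps above.
slugify() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9 -]//g' -e 's/  */ /g' -e 's/^ //' -e 's/ $//' \
    | cut -c1-45 \
    | sed -e 's/ $//' -e 's/ /-/g'
}

slugify "How to train your dragon"   # prints how-to-train-your-dragon
```

The stages follow the C# version in order: lowercase, drop invalid characters, collapse and trim spaces, cut to 45 characters, then turn the remaining spaces into hyphens.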