Scala Up and Running
Goal
Set up a Scala development environment and create a working project with SBT, including configuration management, testing, and packaging.
Installation
Java
## Install AdoptOpenJDK 11 (the project is now continued as Eclipse Temurin)
## Other JDK distributions may fail at runtime with HTTPS issues
Scala
## Install SBT from the official site (scala-sbt.org)
## Verify the Scala version SBT will use:
sbt scalaVersion
Project Setup
Structure
root/
├── build.sbt
├── project/
│   └── plugins.sbt
└── src/
    ├── main/
    │   ├── resources/
    │   │   └── app.local.conf
    │   └── scala/bkr/data/spark/
    │       ├── App.scala
    │       └── AppConfig.scala
    └── test/
        ├── resources/
        │   └── app.local.conf
        └── scala/bkr/data/spark/
            └── AppConfigTests.scala
build.sbt
ThisBuild / version := "0.1.0"
ThisBuild / scalaVersion := "2.13.8"
ThisBuild / organization := "gs"
ThisBuild / scalacOptions ++= Seq("-unchecked", "-deprecation")
lazy val sparkProcessing = (project in file("."))
  .settings(
    name := "SparkProcessing",
    libraryDependencies ++= List(
      "org.apache.spark" %% "spark-core" % "3.2.0",
      "org.apache.spark" %% "spark-sql" % "3.2.0",
      "org.apache.spark" %% "spark-sql-kafka-0-10" % "3.2.0",
      "org.apache.spark" %% "spark-avro" % "3.2.0",
      "org.apache.hadoop" % "hadoop-common" % "3.3.1",
      "org.apache.hadoop" % "hadoop-azure" % "3.3.1"
    ),
    libraryDependencies += "io.confluent" % "kafka-schema-registry-client" % "7.0.0" from "https://packages.confluent.io/maven/io/confluent/kafka-schema-registry-client/7.0.0/kafka-schema-registry-client-7.0.0.jar",
    // ScalaTest 3.1+ is needed for the AnyFunSuite style used in the tests below
    libraryDependencies += "org.scalatest" %% "scalatest" % "3.2.9" % Test,
    libraryDependencies += "com.typesafe" % "config" % "1.4.1",
    assembly / assemblyJarName := s"${name.value}.jar",
    assembly / assemblyMergeStrategy := {
      case PathList("META-INF", xs @ _*) => MergeStrategy.discard
      case _ => MergeStrategy.first
    }
  )
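A quick note on the dependency operators above: `%%` appends the project's Scala binary version to the artifact name, while `%` uses the name as-is (appropriate for Java libraries). A sketch:

```scala
// With scalaVersion 2.13.x:
"org.apache.spark" %% "spark-core" % "3.2.0"    // resolves artifact spark-core_2.13
"org.apache.hadoop" % "hadoop-common" % "3.3.1" // resolves hadoop-common (no suffix)
```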
project/plugins.sbt
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.15.0")
addSbtPlugin("net.virtual-void" % "sbt-dependency-graph" % "0.9.2") // on sbt 1.4+ use the built-in addDependencyTreePlugin instead
AppConfig.scala
package bkr.data.spark
import com.typesafe.config.ConfigFactory
object AppConfig {
  // Select the config file from the "env" environment variable; default to "local"
  private val environment: String = sys.env.getOrElse("env", "local")

  private val appConfig = ConfigFactory.load(s"app.$environment.conf")

  def apply() = appConfig
}
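With `env` unset, `AppConfig` loads `app.local.conf` from `src/main/resources`. The keys below are hypothetical, for illustration only; adjust them to the application:

```hocon
# app.local.conf — example keys (hypothetical)
kafka {
  bootstrapServers = "localhost:9092"
}
spark {
  master = "local[*]"
}
```

Values are then read with standard Typesafe Config calls, e.g. `AppConfig().getString("kafka.bootstrapServers")`.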
App.scala
package bkr.data.spark
object Application extends App {
  if (args.length == 0) throw new IllegalArgumentException("No arguments specified")
  val url = args(0)
  // Using.resource closes the source even if reading fails
  val response = scala.util.Using.resource(scala.io.Source.fromURL(url))(_.mkString)
  println(response)
}
AppConfigTests.scala
package bkr.data.spark

import org.scalatest.funsuite._

class AppConfigTests extends AnyFunSuite {
  test("Hello should start with H") {
    assert("Hello".startsWith("H"))
  }
}
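The test above is a placeholder. The environment-fallback logic in AppConfig can be exercised in isolation without SBT or ScalaTest on the classpath; a minimal sketch (the `resolveEnv` helper is hypothetical, mirroring `sys.env.getOrElse("env", "local")`):

```scala
object EnvFallbackCheck {
  // Mirrors AppConfig's selection: use "env" if present, else "local"
  def resolveEnv(env: Map[String, String]): String =
    env.getOrElse("env", "local")

  def main(args: Array[String]): Unit = {
    assert(resolveEnv(Map.empty) == "local")
    assert(resolveEnv(Map("env" -> "prod")) == "prod")
    println(resolveEnv(Map.empty))
  }
}
```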
.gitignore
target/
SBT Commands
Basic Operations
sbt compile
sbt reload
sbt test
sbt Test/compile
sbt run
sbt projects
sbt dependencyTree
Run with Arguments
sbt "run arg0Value arg1Value"
Subproject Commands
sbt [SUBPROJECT_NAME]/compile
## Example:
sbt helloCore/compile
Interactive Shell
sbt
## Opens SBT shell, then run commands like:
console # Opens Scala REPL
:q # Exit Scala REPL
Packaging
ZIP Distribution
## dist comes from a packaging plugin such as sbt-native-packager (JavaAppPackaging);
## it is not provided by core sbt or sbt-assembly
sbt dist
## The zip is written under target/universal/. Unzip and run the launcher script:
cd publish
./bin/hello
With custom config:
./bin/hello -Dconfig.file=/full/path/to/conf/app.prod.conf
Fat JAR
sbt assembly
The JAR is created in target/scala-2.13/[ProjectName].jar:
java -jar target/scala-2.13/SparkProcessing.jar arg0Value
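Because the Spark dependencies are bundled rather than marked `provided`, the JAR also runs standalone as above. To submit it to a Spark cluster instead, the usual pattern is `spark-submit`; a sketch, where the `--class` name assumes the `Application` object from App.scala:

```shell
spark-submit \
  --class bkr.data.spark.Application \
  --master "local[*]" \
  target/scala-2.13/SparkProcessing.jar arg0Value
```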
Dockerize
FROM hseeberger/scala-sbt:8u302_1.5.5_2.13.6
WORKDIR /app
COPY . ./
RUN sbt compile
ENTRYPOINT ["sbt", "run"]
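Running via `sbt run` inside the container works but re-resolves the build at every startup. A common alternative is a multi-stage build that assembles the fat JAR once and runs it on a plain JRE; a sketch, with the runtime image tag an assumption:

```dockerfile
# Stage 1: build the fat JAR
FROM hseeberger/scala-sbt:8u302_1.5.5_2.13.6 AS build
WORKDIR /app
COPY . ./
RUN sbt assembly

# Stage 2: run it on a slim JRE
FROM eclipse-temurin:11-jre
WORKDIR /app
COPY --from=build /app/target/scala-2.13/SparkProcessing.jar ./
ENTRYPOINT ["java", "-jar", "SparkProcessing.jar"]
```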