Packages

  • package root
    Definition Classes
    root
  • package org
    Definition Classes
    root
  • package apache
    Definition Classes
    org
  • package spark

    Core Spark functionality.

    Core Spark functionality. org.apache.spark.SparkContext serves as the main entry point to Spark, while org.apache.spark.rdd.RDD is the data type representing a distributed collection, and provides most parallel operations.

    In addition, org.apache.spark.rdd.PairRDDFunctions contains operations available only on RDDs of key-value pairs, such as groupByKey and join; org.apache.spark.rdd.DoubleRDDFunctions contains operations available only on RDDs of Doubles; and org.apache.spark.rdd.SequenceFileRDDFunctions contains operations available on RDDs that can be saved as SequenceFiles. These operations are automatically available on any RDD of the right type (e.g. RDD[(Int, Int)] through implicit conversions.

    Java programmers should reference the org.apache.spark.api.java package for Spark programming APIs in Java.

    Classes and methods marked with Experimental are user-facing features which have not been officially adopted by the Spark project. These are subject to change or removal in minor releases.

    Classes and methods marked with Developer API are intended for advanced users want to extend Spark through lower level interfaces. These are subject to changes or removal in minor releases.

    Definition Classes
    apache
  • package sql

    Allows the execution of relational queries, including those expressed in SQL using Spark.

    Allows the execution of relational queries, including those expressed in SQL using Spark.

    Definition Classes
    spark
  • package jdbc
    Definition Classes
    sql
  • JdbcConnectionProvider
  • JdbcDialect
  • JdbcDialects
  • JdbcType
c

org.apache.spark.sql.jdbc

JdbcDialect

abstract class JdbcDialect extends Serializable with Logging

Developer API

Encapsulates everything (extensions, workarounds, quirks) to handle the SQL dialect of a certain database or jdbc driver. Lots of databases define types that aren't explicitly supported by the JDBC spec. Some JDBC drivers also report inaccurate information---for instance, BIT(n>1) being reported as a BIT type is quite common, even though BIT in JDBC is meant for single-bit values. Also, there does not appear to be a standard name for an unbounded string or binary type; we use BLOB and CLOB by default but override with database-specific alternatives when these are absent or do not behave correctly.

Currently, the only thing done by the dialect is type mapping. getCatalystType is used when reading from a JDBC table and getJDBCType is used when writing to a JDBC table. If getCatalystType returns null, the default type handling is used for the given JDBC type. Similarly, if getJDBCType returns (null, None), the default type handling is used for the given Catalyst type.

Annotations
@DeveloperApi()
Source
JdbcDialects.scala
Linear Supertypes
Logging, Serializable, Serializable, AnyRef, Any
Ordering
  1. Alphabetic
  2. By Inheritance
Inherited
  1. JdbcDialect
  2. Logging
  3. Serializable
  4. Serializable
  5. AnyRef
  6. Any
  1. Hide All
  2. Show All
Visibility
  1. Public
  2. All

Instance Constructors

  1. new JdbcDialect()

Abstract Value Members

  1. abstract def canHandle(url: String): Boolean

    Check if this dialect instance can handle a certain jdbc url.

    Check if this dialect instance can handle a certain jdbc url.

    url

    the jdbc url.

    returns

    True if the dialect can be applied on the given jdbc url.

    Exceptions thrown

    NullPointerException if the url is null.

Concrete Value Members

  1. def alterTable(tableName: String, changes: Seq[TableChange], dbMajorVersion: Int): Array[String]

    Alter an existing table.

    Alter an existing table.

    tableName

    The name of the table to be altered.

    changes

    Changes to apply to the table.

    returns

    The SQL statements to use for altering the table.

  2. def beforeFetch(connection: Connection, properties: Map[String, String]): Unit

    Override connection specific properties to run before a select is made.

    Override connection specific properties to run before a select is made. This is in place to allow dialects that need special treatment to optimize behavior.

    connection

    The connection object

    properties

    The connection properties. This is passed through from the relation.

  3. def classifyException(message: String, e: Throwable): AnalysisException

    Gets a dialect exception, classifies it and wraps it by AnalysisException.

    Gets a dialect exception, classifies it and wraps it by AnalysisException.

    message

    The error message to be placed to the returned exception.

    e

    The dialect specific exception.

    returns

    AnalysisException or its sub-class.

  4. def compileValue(value: Any): Any

    Converts value to SQL expression.

    Converts value to SQL expression.

    value

    The value to be converted.

    returns

    Converted value.

    Annotations
    @Since( "2.3.0" )
  5. def getAddColumnQuery(tableName: String, columnName: String, dataType: String): String
  6. def getCatalystType(sqlType: Int, typeName: String, size: Int, md: MetadataBuilder): Option[DataType]

    Get the custom datatype mapping for the given jdbc meta information.

    Get the custom datatype mapping for the given jdbc meta information.

    sqlType

    The sql type (see java.sql.Types)

    typeName

    The sql type name (e.g. "BIGINT UNSIGNED")

    size

    The size of the type.

    md

    Result metadata associated with this type.

    returns

    The actual DataType (subclasses of org.apache.spark.sql.types.DataType) or null if the default type mapping should be used.

  7. def getDeleteColumnQuery(tableName: String, columnName: String): String
  8. def getJDBCType(dt: DataType): Option[JdbcType]

    Retrieve the jdbc / sql type for a given datatype.

    Retrieve the jdbc / sql type for a given datatype.

    dt

    The datatype (e.g. org.apache.spark.sql.types.StringType)

    returns

    The new JdbcType if there is an override for this DataType

  9. def getRenameColumnQuery(tableName: String, columnName: String, newName: String, dbMajorVersion: Int): String
  10. def getSchemaCommentQuery(schema: String, comment: String): String
  11. def getSchemaQuery(table: String): String

    The SQL query that should be used to discover the schema of a table.

    The SQL query that should be used to discover the schema of a table. It only needs to ensure that the result set has the same schema as the table, such as by calling "SELECT * ...". Dialects can override this method to return a query that works best in a particular database.

    table

    The name of the table.

    returns

    The SQL query to use for discovering the schema.

    Annotations
    @Since( "2.1.0" )
  12. def getTableCommentQuery(table: String, comment: String): String
  13. def getTableExistsQuery(table: String): String

    Get the SQL query that should be used to find if the given table exists.

    Get the SQL query that should be used to find if the given table exists. Dialects can override this method to return a query that works best in a particular database.

    table

    The name of the table.

    returns

    The SQL query to use for checking the table.

  14. def getTruncateQuery(table: String, cascade: Option[Boolean] = isCascadingTruncateTable): String

    The SQL query that should be used to truncate a table.

    The SQL query that should be used to truncate a table. Dialects can override this method to return a query that is suitable for a particular database. For PostgreSQL, for instance, a different query is used to prevent "TRUNCATE" affecting other tables.

    table

    The table to truncate

    cascade

    Whether or not to cascade the truncation

    returns

    The SQL query to use for truncating a table

    Annotations
    @Since( "2.4.0" )
  15. def getTruncateQuery(table: String): String

    The SQL query that should be used to truncate a table.

    The SQL query that should be used to truncate a table. Dialects can override this method to return a query that is suitable for a particular database. For PostgreSQL, for instance, a different query is used to prevent "TRUNCATE" affecting other tables.

    table

    The table to truncate

    returns

    The SQL query to use for truncating a table

    Annotations
    @Since( "2.3.0" )
  16. def getUpdateColumnNullabilityQuery(tableName: String, columnName: String, isNullable: Boolean): String
  17. def getUpdateColumnTypeQuery(tableName: String, columnName: String, newDataType: String): String
  18. def isCascadingTruncateTable(): Option[Boolean]

    Return Some[true] iff TRUNCATE TABLE causes cascading default.

    Return Some[true] iff TRUNCATE TABLE causes cascading default. Some[true] : TRUNCATE TABLE causes cascading. Some[false] : TRUNCATE TABLE does not cause cascading. None: The behavior of TRUNCATE TABLE is unknown (default).

  19. def quoteIdentifier(colName: String): String

    Quotes the identifier.

    Quotes the identifier. This is used to put quotes around the identifier in case the column name is a reserved keyword, or in case it contains characters that require quotes (e.g. space).

  20. def removeSchemaCommentQuery(schema: String): String
  21. def renameTable(oldTable: String, newTable: String): String

    Rename an existing table.

    Rename an existing table.

    oldTable

    The existing table.

    newTable

    New name of the table.

    returns

    The SQL statement to use for renaming the table.