All Versions
97
Latest Version
Avg Release Cycle
56 days
Latest Release
-

Changelog History
Page 10

  • v0.14.0 Changes

    • ๐Ÿš€ The primary feature in this release is "Slave Recovery" which allows restarted slaves (e.g., deploys, crashes) to reconnect with old live executors/tasks. To enable slave recovery:

      • First you need to enable checkpointing on slaves with "--checkpoint" flag.
      • Frameworks can opt in to this feature by setting "FrameworkInfo.checkpoint" when registering with the master.
      • Once a Framework opts in, a restarted slave will recover all the framework's tasks and executors. The tasks/executors stay alive through a slave restart and reconnect with the restarted slave.
      • Slave recovery also improves the reliability of delivering status updates.
    • ๐Ÿš€ The release also includes a new feature called "Resource Reservations" which allows reserving resources on a slave to particular roles (This is an experimental feature).

    • ๐Ÿš€ This release also includes a new Mesos plugin for Jenkins which allows Jenkins to dynamically launch Jenkins slaves on a Mesos cluster (This is an experimental feature).

    ๐Ÿ›  There are also several bug fixes and stability improvements.

    All Issues: ** Sub-task

    • [MESOS-548] - Upgrade angular.js to use the full angular-ui.js
    • [MESOS-549] - Change truncated IDs to show on hover
    • [MESOS-630] - Improve the performance of Master::Http::stats().

    ** ๐Ÿ› Bug

    • [MESOS-235] - Mesos daemon ignores --conf option
    • [MESOS-368] - HTTP.Endpoints test is flaky.
    • [MESOS-370] - The process based isolation module should walk the process tree to collect resource usage.
    • [MESOS-380] - Command Executor doesn't send TASK_KILLED for killed tasks.
    • [MESOS-434] - Process isolator libprocess throws exception
    • [MESOS-449] - CgroupsTests are flaky on Ubuntu
    • [MESOS-451] - Always update resources for reregistered executors.
    • [MESOS-461] - Freezer failure while in FREEZING state.
    • [MESOS-479] - SlaveRecoveryTest/0.CleanupExecutor failure.
    • [MESOS-485] - Latest trunk fails on strict aliasing on CentOS
    • [MESOS-490] - Update mesos-daemon.sh (and associated scripts) to work with new flags mechanisms.
    • [MESOS-497] - Queued tasks should be launched in the order they were received
    • [MESOS-499] - Local slave run crashes on startup
    • [MESOS-508] - Master crash due to Broken Pipe
    • [MESOS-514] - FaultToleranceTest.ReconcileIncompleteTasks is flaky
    • [MESOS-522] - ZooKeeperMasterDetectorTest.MasterDetectorExpireSlaveZKSessionNewMaster
    • [MESOS-534] - ReaperTest.TerminatedChildProcess is flaky on Jenkins.
    • [MESOS-545] - Remove hack in post-reviews.py for tracking parent branch
    • [MESOS-582] - HTTP.Endpoints is flaky
    • [MESOS-594] - Add CXXFLAGS='-fno-strict-aliasing' if using gcc 4.4.*.
    • [MESOS-597] - Set MESOS_NATIVE_LIBRARY or (DY)LD_LIBRARY_PATH before launching an executor in order to enable JVM based executors to easily find libmesos.so.
    • [MESOS-599] - Make sure stderr/stdout get launcher output.
    • [MESOS-607] - Slave recovery should properly handle executors that were cleanly terminated in the previous run
    • [MESOS-611] - Refactor slave recovery to ensure slave recovers its state first
    • [MESOS-612] - Slave should not recover completed executors
    • [MESOS-614] - Master should remove checkpointing slave that gets disconnected when the new slave tries to register
    • [MESOS-616] - The Master / Slave should not store frameworks as both active and completed.
    • [MESOS-619] - Master should properly reconcile KillTasks
    • [MESOS-627] - Slave should offer total disk instead of available disk by default
    • [MESOS-628] - A non-checkpointing slave should still cleanup the latest slave symlink
    • [MESOS-632] - Executor driver should commit suicide if it cannot re-connect with a slave after a timeout
    • [MESOS-633] - Master should inform a recovered slave about frameworks that were completed
    • [MESOS-635] - Master doesn't update the task state when it generates TASK_LOST
    • [MESOS-636] - Executors under cgroups isolator die immediately when a slave dies if it has a controlling TTY attached
    • [MESOS-637] - Executor should reregister with the updates in the same order as it received them
    • [MESOS-638] - Slave should not send command executor infos to master when it reregisters
    • [MESOS-640] - Duplicate status update with same UUID crashes the slave
    • [MESOS-644] - Slave doesn't correctly handle checkpointed terminal update whose ack doesn't reach the executor
    • [MESOS-646] - Slave recovery doesn't properly handle checkpointed queued tasks
    • [MESOS-648] - Slave should properly handle partial writes of status updates
    • [MESOS-657] - SlaveRecoveryTest/1.PartitionedSlave fails with cgroups
    • [MESOS-668] - SlaveRecoveryTest/0.MultipleFrameworks flaky
    • [MESOS-671] - CgroupsIsolator does not listen for OOM events on recovered executors.
    • [MESOS-673] - Task reconciliation does not properly release executor resources.
    • [MESOS-675] - CHECK failure in the Master.
    • [MESOS-676] - Slave::reregistered LOG(FATAL)s due to being in RECOVERING state.
    • [MESOS-689] - Master incorrectly rejects tasks for long lived executors if they don't have FrameworkID set

    ** ๐Ÿ‘Œ Improvement

    • [MESOS-179] - Need to check for Python development headers
    • [MESOS-221] - New Allocators
    • [MESOS-329] - Add 'help' endpoints to libprocess.
    • [MESOS-552] - Jenkins scheduler should use the latest Mesos jar built from the repo
    • [MESOS-553] - Jenkins plugin should bundle the native Mesos library
    • [MESOS-554] - Jenkins scheduler should properly handle TASK_LOST
    • [MESOS-555] - Jenkins scheduler should reuse a Jenkins slave
    • [MESOS-557] - Upgrade to Bootstrap CSS v2.3.2
    • [MESOS-558] - Upgrade to full release of Angular JS
    • [MESOS-559] - Replace Bootstrap's JS with Angular UI Bootstrap
    • [MESOS-580] - Improve Command Executor
    • [MESOS-613] - Give better guidance when recovery fails
    • [MESOS-626] - Add the ability for example frameworks to checkpoint
    • [MESOS-634] - Make slave recovery more robust by ignoring absence of files
    • [MESOS-651] - Expose slave re-registration time in the Web UI
    • [MESOS-663] - Expose recovery errors when running recovery in --no-strict mode

    ** ๐Ÿ†• New Feature

    • [MESOS-110] - Slave Recovery: A slave restart should not restart tasks
    • [MESOS-203] - Killtree that recursively kills sessions
    • [MESOS-504] - Add weighted DRF.
    • [MESOS-505] - Add resource reservations/pools per role.
    • [MESOS-506] - Implement Jenkins scheduler for Mesos

    ** Task

    • [MESOS-643] - Revert the semantics of newly introduced changes to FrameworkReregistered messages
    • [MESOS-647] - Revert the default recovery mode to strict
  • v0.13.0 Changes

    • ๐Ÿš€ This release includes a major refactor of the internal testing infrastructure.
    • ๐Ÿ›  There are also several bug fixes and stability improvements (esp. around ZooKeeper).
    • ๐Ÿšš Hadoop on Mesos is moved out of the mesos repo to its own repository (https://github.com/mesos/hadoop).

    All Issues: ** ๐Ÿ› Bug

    • [MESOS-77] - ExceptionTest.AbortOnFrameworkError sometimes hangs if Mesos built without optimizations
    • [MESOS-201] - CppFramework test occasionally fails
    • [MESOS-217] - LOST tasks are incorrectly reconciled between mesos and framework
    • [MESOS-232] - Unit test CoordinatorTest.Elect triggers non-deterministic assertion failure in libprocess.
    • [MESOS-276] - SIGSEV with current trunk and OpenJDK 7u3
    • [MESOS-277] - Java test framework test is flaky
    • [MESOS-289] - Zookeeper tests are flaky
    • [MESOS-301] - Coordinator test is flaky
    • [MESOS-318] - os::memory does not consider sysinfo.mem_unit
    • [MESOS-321] - libprocess http::encode fails test
    • [MESOS-344] - --disable-java still uses java headers during make check
    • [MESOS-353] - ZooKeepet state test GetSetGet hung
    • [MESOS-362] - Inconsistent slave maps in the master.
    • [MESOS-365] - Slave should reject tasks before registering with the master.
    • [MESOS-366] - Master check failure during load tests.
    • [MESOS-369] - Mesos tests spitting out error messages.
    • [MESOS-379] - Zookeeper MasterDetectorExpireSlaveZKSessionNewMaster test is flaky
    • [MESOS-385] - MasterTest.TaskRunning flaky on Jenkins.
    • [MESOS-392] - FaultTolerance SchedulerExit test hangs
    • [MESOS-393] - Forking at an unlucky time on OS X can cause the C++ library to deadlock.
    • [MESOS-394] - Don't do ExecutorLauncher in forked process but exec first instead.
    • [MESOS-395] - FaultToleranceTest.SchedulerFailoverFrameworkMessage test is flaky.
    • [MESOS-399] - MonitorTest.WatchUnwatch failed.
    • [MESOS-400] - Example Java framework test is flaky
    • [MESOS-401] - SlaveRecoveryTest/0.RecoverTerminatedExecutor is flaky on OSX.
    • [MESOS-402] - CoordinatorTest.TruncateNotLearnedFill test is flaky
    • [MESOS-405] - SlaveRecoveryTest/1.ReconnectExecutor crashes.
    • [MESOS-406] - Google mock throws a segfault when invoked by TestFilter
    • [MESOS-407] - Google test filter processing is incorrect for the empty string.
    • [MESOS-408] - FaultToleranceTest.SlavePartitioned is flaky
    • [MESOS-412] - MasterTest.ShutdownUnregisteredExecutor flaky
    • [MESOS-423] - A slave asked to shutdown should not reregister with a new slave id
    • [MESOS-424] - CgroupsIsolatorTest.BalloonFramework runs forever
    • [MESOS-436] - FaultToleranceTest.SchedulerFailover test is flaky
    • [MESOS-437] - ResourceOffersTest.ResourceOfferWithMultipleSlaves is flaky
    • [MESOS-440] - Allow for headroom in the GC algorithm.
    • [MESOS-441] - AllocatorZooKeeperTest/0.FrameworkReregistersFirst is flaky
    • [MESOS-446] - Master should shutdown slaves that were deactivated
    • [MESOS-447] - Master should send TASK_LOST updates for unknown tasks when slave reregisters
    • [MESOS-450] - The master should shut down slaves upon removal.
    • [MESOS-453] - AllocatorZookeeper tests are using /tmp/mesos work directory
    • [MESOS-454] - ResourceOffers tests are using /tmp/mesos working directory
    • [MESOS-462] - Resource usage collection failure messages have '1' as the failure message.
    • [MESOS-469] - Scheduler driver should call disconnected on master failover
    • [MESOS-474] - Mesos 0.10.0: make check fails on Ubuntu 12.04LTS
    • [MESOS-476] - Upgrade libev to 4.15
    • [MESOS-481] - Slave needs to only inform the master about non-terminal executors, for proper resource accounting
    • [MESOS-482] - Status update manager should not cleanup the stream when there are pending updates, even though it received an ACK for a terminal update
    • [MESOS-484] - Latest ZooKeeperState.cpp doesn't compile on Mountain Lion
    • [MESOS-502] - Slave crashes when handling duplicate terminal updates
    • [MESOS-515] - Slave fails to detect free disk space
    • [MESOS-524] - JVM tests are flaky on OSX.
    • [MESOS-530] - A registered slave should check registration id when it receives mulitple re(re-)gistered messages from the master
    • [MESOS-535] - Master crashes when removing non-checkpointing framework from a checkpointing slave
    • [MESOS-538] - Master should not offer non-checkpointing slave's resources to checkpointing frameworks
    • [MESOS-587] - MesosNativeLibrary.java doesn't read environment variable "MESOS_NATIVE_LIBRARY".
    • [MESOS-604] - Slave should use the filesystem containing the work directory for disk usage calculation
    • [MESOS-605] - Fix wrong compression for hadoop package
    • [MESOS-606] - Slave incorrectly moves a task from 'terminatedTasks' to 'completedTasks'
    • [MESOS-609] - Executor should remove the task from queuedTasks when it moves a queued task to terminatedTasks.

    ** ๐Ÿ‘Œ Improvement

    • [MESOS-46] - Refactor MasterTest to use fixture
    • [MESOS-134] - Add Python documentation
    • [MESOS-140] - Unrecognized command line args should fail the process
    • [MESOS-242] - Add more tests to Dominant Share Allocator
    • [MESOS-305] - Inform the frameworks / slaves about a master failover
    • [MESOS-346] - Improve OSX configure output when deprecated headers are present.
    • [MESOS-360] - Mesos jar should be built for java 6
    • [MESOS-409] - Master detector code should stat nodes before attempting to create
    • [MESOS-472] - Separate ResourceStatistics::cpu_time into ResourceStatistics::cpu_user_time and ResourceStatistics::cpu_system_time.
    • [MESOS-493] - Expose version information in http endpoints
    • [MESOS-503] - Master should log LOST messages sent to the framework
    • [MESOS-526] - Change slave command line flag from 'safe' to 'strict'
    • [MESOS-602] - Allow Mesos native library to be loaded from an absolute path
    • [MESOS-603] - Add support for better test output in newer versions of autools

    ** ๐Ÿ†• New Feature

    • [MESOS-169] - Ability to for tests to catch in-flight messages that are dispatched

    ** Task

    • [MESOS-618] - Remove Hadoop on Mesos from repository in favor of external repository
    • [MESOS-643] - Revert the semantics of newly introduced changes to FrameworkReregistered messages
  • v0.12.1 Changes

    • ๐Ÿš€ This release is primarily a bug fix release with a few small features for running JVM frameworks, like Hadoop and Jenkins.

    All Issues: ** ๐Ÿ› Bug

    • [MESOS-515] - Slave fails to detect free disk space
    • [MESOS-524] - JVM tests are flaky on OSX.
    • [MESOS-587] - MesosNativeLibrary.java doesn't read environment variable "MESOS_NATIVE_LIBRARY".
    • [MESOS-597] - Set MESOS_NATIVE_LIBRARY or (DY)LD_LIBRARY_PATH before launching an executor in order to enable JVM based executors to easily find libmesos.so.
    • [MESOS-599] - Make sure stderr/stdout get launcher output.
    • [MESOS-604] - Slave should use the filesystem containing the work directory for disk usage calculation
    • [MESOS-605] - Fix wrong compression for hadoop package

    ** ๐Ÿ‘Œ Improvement

    • [MESOS-179] - Need to check for Python development headers
    • [MESOS-346] - Improve OSX configure output when deprecated headers are present.
    • [MESOS-360] - Mesos jar should be built for java 6
    • [MESOS-602] - Allow Mesos native library to be loaded from an absolute path
    • [MESOS-603] - Add support for better test output in newer versions of autools
  • v0.12.0 Changes

    • ๐Ÿš€ This release includes bug fixes and stability improvements.
    • ๐Ÿš€ The primary feature in this release is executor resource consumption monitoring. Slaves now monitor resource consumption of running executors and expose it over JSON and through the webui.
    • ๐Ÿš€ This release also includes a new and improved Hadoop framework. The new port doesn't require patching Hadoop (i.e., it's a self contained Hadoop contrib) and lets you use existing schedulers (e.g., the fair scheduler or capacity scheduler)! The tradeoff, however, is that it doesn't take as much advantage of the fine-grained nature of a Mesos task (i.e., there is no longer a 1-1 mapping between a Mesos task and a map/reduce task).

    All Issues: ** Sub-task

    • [MESOS-214] - Report resources being used by executors
    • [MESOS-419] - Old slave directories should not be garbage collected based on file modification time

    ** ๐Ÿ› Bug

    • [MESOS-107] - Scheduler library should not acknowledge a status update if the driver has been aborted.
    • [MESOS-152] - Slave should forward status updates for unknown tasks
    • [MESOS-285] - configure.macosx checks for version "10.7" but should check for 10.7 or greater
    • [MESOS-307] - Web UI file download links are broken.
    • [MESOS-317] - python mesos core bindings rejects framework messages with null bytes
    • [MESOS-319] - Fix buggy read / write calls.
    • [MESOS-325] - make clean is broken
    • [MESOS-332] - Executor launcher fetches resources as slave user, instead of executor user.
    • [MESOS-340] - Gperftools target always rebuilds.
    • [MESOS-374] - HTTP GET requests to /statistics/snapshot.json crash the slave
    • [MESOS-422] - Master leader election should be more robust to stale ephemeral nodes
    • [MESOS-486] - TaskInfo should include a 'source' in order to enable getting resource monitoring statistics.

    ** ๐Ÿ‘Œ Improvement

    • [MESOS-293] - Make clean deletes checked in files.

    ** ๐Ÿ†• New Feature

    • [MESOS-99] - display slave resource usage information in the slave webui

    ** Task

    • [MESOS-274] - Unicode / Binary files over http endpoints.
    • [MESOS-324] - Monitor executor resource usage.
    • [MESOS-331] - Add --disable-perftools to configure.
  • v0.11.0 Changes

    ** Brainstorming * [MESOS-357] - participate in GSoC 2013 * [MESOS-358] - do not participate in the GSoC 2013

    All Issues: ** ๐Ÿ› Bug

    • [MESOS-260] - Implement a duration abstraction
    • [MESOS-261] - bootstrap fails when automake version >= 1.12
    • [MESOS-263] - Complete the new webui (slave, framework, executor pages)
    • [MESOS-264] - Make fails on the latest ubuntu
    • [MESOS-270] - Log viewing broken on mesos-local runs
    • [MESOS-364] - cgroup tests fail on Ubuntu
    • [MESOS-386] - AllocatorTest/0.TaskFinished has incomplete expectations.
    • [MESOS-388] - Latest update breaks building on OSX
    • [MESOS-404] - FilesTest.BrowseTest is flaky
    • [MESOS-413] - AllocatorTest/0.TaskFinished test has bad expectations.

    ** ๐Ÿ‘Œ Improvement

    • [MESOS-252] - Web UI Improvements
    • [MESOS-253] - Enable -Wall -Werror on the build
    • [MESOS-254] - Improve mesos slave's garbage collection
    • [MESOS-259] - Expose slave attributes in slave endpoint & sortable tables

    ** Task

    • [MESOS-275] - HTTP endpoint for file download.
    • [MESOS-279] - Impose a limit on HTTP Response size
    • [MESOS-389] - Add OSX slave to the Jenkins build.
    • [MESOS-491] - Add mesos-0.11.0-incubating jar to maven central
  • v0.10.0 Changes

    All Issues: ** Sub-task

    • [MESOS-89] - create utilities to collect information from the proc filesystem
    • [MESOS-212] - Eliminate Bottle and Python based webui.
    • [MESOS-222] - Rename SimpleAllocator to DominantShareAllocator
    • [MESOS-223] - Libprocess-ify Allocator
    • [MESOS-224] - Write allocator tests

    ** ๐Ÿ› Bug

    • [MESOS-17] - Hadoop executors killed while tasks in COMMIT_PENDING
    • [MESOS-34] - Rendering JSON needs to escape strings properly.
    • [MESOS-44] - Master Detector uses the wrong ACL when auth is not required
    • [MESOS-48] - Remove the failover flag from executor driver
    • [MESOS-54] - Mesos ZooKeeper authentication is broken
    • [MESOS-83] - Filters should be removed when more resources are available than rejected offer had
    • [MESOS-145] - mesos executor holds on to fd spawned by slave after slave death, preventing slave from restarting
    • [MESOS-148] - Building of included Hadoop broken
    • [MESOS-164] - Crash on Mac OS X due to dlopen not being thread-safe
    • [MESOS-183] - Included MPI Framework Fails to Start
    • [MESOS-187] - Mesos should not pass an invalid task to a slave
    • [MESOS-190] - Slave seg fault when executor exited
    • [MESOS-199] - Attempting to use killtree.sh after forked pid has died is fruitless.
    • [MESOS-200] - New Linux 'proc' code assuming too new a kernel.
    • [MESOS-209] - A race bug in ProcessManager::spawn in libprocess.
    • [MESOS-211] - Fix Slave GC tests
    • [MESOS-218] - Master throws exception on removeTask() if Framework is not connected
    • [MESOS-220] - Slave throws exception in librocess when master is down
    • [MESOS-229] - mesos zookeeper group code fails to connect when pre-existing children of the group path are read-only
    • [MESOS-233] - port should not be of type short
    • [MESOS-239] - Allocator doesn't handle framework failover correctly
    • [MESOS-244] - Mesos slave process is not shutting down cleanly
    • [MESOS-247] - Ranges comparison has a bug
    • [MESOS-248] - Python quit unexpectedly while using mesos plugin
    • [MESOS-251] - DRF allocator doesn't expire filter correctly
    • [MESOS-257] - Master doesn't recover resources when executor exits
    • [MESOS-262] - Slave should not charge the resources required for launching a executor against the executor
    • [MESOS-266] - When the master removes a slave, a shutdown should be sent.
    • [MESOS-268] - Slave should force kill executors when it is shutting down
    • [MESOS-278] - Master Fails to Connect to Zookeeper
    • [MESOS-284] - Short-term fix for fire-walling slave shutdown and lost slave messages from the 'wrong' master
    • [MESOS-286] - AllocatorTest is flaky
    • [MESOS-288] - Latest trunk does not finish make check
    • [MESOS-290] - Jobtracker can't get TaskTrackerInfo when the JobTracker log file is deleted
    • [MESOS-299] - Master detector doesn't notify about leading master after network disconnection
    • [MESOS-302] - Scheduler driver shouldn't send an ACK if the driver is aborted while sending stats update
    • [MESOS-303] - mesos slave crashes during framework termination
    • [MESOS-306] - Mesos-master frequently crashes
    • [MESOS-310] - cgroups isolation module should not block on fetching executors
    • [MESOS-339] - Release script is expecting enter, not any key
    • [MESOS-382] - FaultToleranceTest.FrameworkReliableRegistration test is flaky.
    • [MESOS-383] - AllocatorTest/0.FrameworkExited test is broken
    • [MESOS-464] - mesos 0.10.0 fails to build on ubutu 13.04

    ** ๐Ÿ‘Œ Improvement

    • [MESOS-8] - Maintain a history of executed frameworks/tasks and show it on the web UI
    • [MESOS-57] - Submit mesos.jar to Maven Central
    • [MESOS-142] - Explicitly set and clean test work directories
    • [MESOS-149] - Garbage collection on slaves
    • [MESOS-171] - Make CommandInfo 'uri' field be repeated, possibly making a URI embedded message to describe whether or not we should 'chmod +x' the resulting resource.
    • [MESOS-180] - Update the Hadoop patch to list protobuf-2.4.1 as a dependency so Maven pulls it down.
    • [MESOS-193] - Create a single-page javascript interface to replace the existing webui
    • [MESOS-194] - Make killtree more verbose.
    • [MESOS-255] - Expose files through HTTP endpoints.
    • [MESOS-256] - Introduce a cluster name into the Web UI
    • [MESOS-272] - Create a 'fs' namespace and migrate as appropriate from our 'os' namespace.

    ** ๐Ÿ†• New Feature

    • [MESOS-86] - Expose master url to the scheduler
    • [MESOS-158] - Make ExecutorInfo more rich
    • [MESOS-185] - Provide a master stat indicating number of outstanding resource offers
    • [MESOS-207] - A new isolation module on Linux that uses Linux control groups (cgroups) directly.
    • [MESOS-208] - Add whitelist option to master

    ** Question

    • [MESOS-258] - mesos-master / mesos-slave => error: [Errno 32] Broken pipe
    • [MESOS-298] - Executor fails to start
    • [MESOS-311] - ClassNotFoundException when deploying hadoop on mesos

    ** Task

    • [MESOS-69] - Migrate to Apache wiki and de-activate github wiki
    • [MESOS-80] - Add a page or section to the wiki defining project coding standards
    • [MESOS-133] - Make Mesos clean of most GCC warnings
    • [MESOS-398] - Add mesos-0.10.0-incubating jar to maven central
  • v0.9.0 Changes

    ** โฌ†๏ธ Dependency upgrade * [MESOS-174] - Upgrade protobuf dependency to version 2.4.1

    ** ๐Ÿ‘Œ Improvement * [MESOS-3] - Ask executors to shutdown when a framework goes away * [MESOS-167] - Make API names consistent

    ** ๐Ÿ†• New Feature * [MESOS-146] - EC2 scripts should find the latest AMI from a known URL