AnjLab Blog
This section aggregates posts that team members publish on their personal blogs.
Profiling GAE API calls
While optimizing the performance of a GAE application, it's convenient to measure GAE API calls.
I'm using the following implementation of com.google.apphosting.api.ApiProxy.Delegate to do this:
public class ProfilingDelegate implements Delegate<Environment> {
private static final Logger logger = LoggerFactory.getLogger(ProfilingDelegate.class);
private final Delegate<Environment> parent;
private final String appPackage;
public ProfilingDelegate(Delegate<Environment> parent, String appPackage) {
this.parent = parent;
this.appPackage = appPackage;
}
public void log(Environment env, LogRecord logRec) {
parent.log(env, logRec);
}
@Override
public byte[] makeSyncCall(Environment env, String pkg, String method, byte[] request) throws ApiProxyException {
long start = System.currentTimeMillis();
byte[] result = parent.makeSyncCall(env, pkg, method, request);
StringBuilder builder = buildStackTrace(appPackage);
logger.info("GAE/S {}.{}: ->{} ms<-\n{}", new Object[] { pkg, method, System.currentTimeMillis() - start, builder });
return result;
}
/**
*
* @param appPackage
* Only classes from this package would be included in trace.
* @return
*/
public static StringBuilder buildStackTrace(String appPackage) {
StackTraceElement[] traces = Thread.currentThread().getStackTrace();
StringBuilder builder = new StringBuilder();
int length = traces.length;
StackTraceElement traceElement;
String className;
for (int i = 3; i < length; i++) {
traceElement = traces[i];
className = traceElement.getClassName();
if (className.startsWith(appPackage)) {
if (builder.length() > 0) {
builder.append('\n');
}
builder.append("..");
builder.append(className.substring(className.lastIndexOf('.')));
builder.append('.');
builder.append(traceElement.getMethodName());
builder.append(':');
builder.append(traceElement.getLineNumber());
}
}
if (builder.length() == 0) {
for (int i = 1; i < length; i++) {
traceElement = traces[i];
className = traceElement.getClassName();
if (builder.length() > 0) {
builder.append('\n');
}
builder.append(className);
builder.append('.');
builder.append(traceElement.getMethodName());
builder.append(':');
builder.append(traceElement.getLineNumber());
}
}
return builder;
}
@Override
public Future<byte[]> makeAsyncCall(Environment env, String pkg, String method, byte[] request, ApiConfig config) {
long start = System.currentTimeMillis();
Future<byte[]> result = parent.makeAsyncCall(env, pkg, method, request, config);
StringBuilder builder = buildStackTrace(appPackage);
logger.info("GAE/A {}.{}: ->{} ms<-\n{}", new Object[] { pkg, method, System.currentTimeMillis() - start, builder });
return result;
}
}
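Note that for asynchronous calls the delegate above only measures how long it takes to dispatch the call, not how long the call itself runs. One way to approximate the full duration is to wrap the returned Future and report the elapsed time when a completed result is first retrieved. The following is a minimal sketch of that idea, not part of the original delegate; TimedFuture and its callback parameter are hypothetical names, and the measured time includes any delay before the caller asks for the result.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.LongConsumer;

/** Hypothetical wrapper: reports elapsed time once a completed result is first retrieved. */
class TimedFuture<T> implements Future<T> {
    private final Future<T> delegate;
    private final LongConsumer onComplete; // receives elapsed milliseconds
    private final long startNanos = System.nanoTime();
    private boolean reported;

    TimedFuture(Future<T> delegate, LongConsumer onComplete) {
        this.delegate = delegate;
        this.onComplete = onComplete;
    }

    // Invoke the callback at most once, and only after the wrapped future is done.
    private synchronized void reportIfDone() {
        if (!reported && delegate.isDone()) {
            reported = true;
            onComplete.accept(TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startNanos));
        }
    }

    @Override
    public T get() throws InterruptedException, ExecutionException {
        try {
            return delegate.get();
        } finally {
            reportIfDone();
        }
    }

    @Override
    public T get(long timeout, TimeUnit unit)
            throws InterruptedException, ExecutionException, TimeoutException {
        try {
            return delegate.get(timeout, unit);
        } finally {
            reportIfDone(); // not reported if the wait merely timed out
        }
    }

    @Override
    public boolean cancel(boolean mayInterruptIfRunning) {
        return delegate.cancel(mayInterruptIfRunning);
    }

    @Override
    public boolean isCancelled() {
        return delegate.isCancelled();
    }

    @Override
    public boolean isDone() {
        return delegate.isDone();
    }
}
```

In makeAsyncCall one could then return something like new TimedFuture<>(result, ms -> logger.info(...)) instead of the raw Future, assuming the rest of the delegate stays as it is.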
To register this delegate, add the following code so that it runs prior to any API calls, e.g. in a filter's init() method:
public void init(FilterConfig config) throws ServletException
{
this.config = config;
// Note: Comment this off to profile Google API requests
ApiProxy.setDelegate(new ProfilingDelegate(ApiProxy.getDelegate(), "dmitrygusev"));
}
Here's an example of log output:
02.09.2010 0:22:19 dmitrygusev.tapestry5.gae.ProfilingDelegate makeSyncCall
INFO: GAE/S datastore_v3.BeginTransaction: ->1076 ms<-
...LazyJPATransactionManager$1.assureTxBegin:48
...LazyJPATransactionManager$1.createQuery:137
...AccountDAOImpl.findByEmail:36
...AccountDAOImpl.getAccount:26
...AccountDAOImplCache.getAccount:36
...Application.getUserAccount:395
...Application.trackUserActivity:400
...AppModule$1.service:229
...AppModule$2.service:291
...LazyTapestryFilter.doFilter:62
02.09.2010 0:22:19 dmitrygusev.tapestry5.gae.LazyJPATransactionManager$1 assureTxBegin
INFO: Transaction created (1200 ms) for context ...AccountDAOImpl.findByEmail:36
...AccountDAOImpl.getAccount:26
...AccountDAOImplCache.getAccount:36
...Application.getUserAccount:395
...Application.trackUserActivity:400
...AppModule$1.service:229
...AppModule$2.service:291
See also GAE and Tapestry5 Data Access Layer
Read more [Dmitry Gusev Blog]
GAE and Tapestry5 Data Access Layer
GAE provides two ways of communicating with its datastore from Java: the low-level Datastore API and the high-level standard APIs (JDO and JPA).
In this post I will try to explain some performance improvements for JPA usage. Of course, there's always some overhead in using a high-level API, but I use JPA in Ping Service and think it's worth it.
Spring vs. Tapestry-JPA
It's good practice to use JPA in conjunction with an IoC container to inject EntityManager into your services. At the very beginning of development I used Spring 3.0 as the IoC container and for transaction management. It worked, but it took too much time to initialize during load requests, so every time a user opened their first web page, the request ended with DeadlineExceededException.
Then I tried tapestry-jpa from Tynamo and it fits perfectly. It runs pretty fast and allows to:
DAO and Caching
Since a GAE datastore transaction can't span entities from different entity groups, I've added the @CommitAfter annotation to every method of each DAO class.
Datastore access is an expensive operation in GAE, so I've implemented DAO-level caching:
DAO interface
public interface JobDAO {
// ...
@CommitAfter
public abstract Job find(Key jobKey);
@CommitAfter
public abstract void update(Job job, boolean commitAfter);
}
DAO implementation
public class JobDAOImpl implements JobDAO {
// ...
@Override
public Job find(Key jobKey) {
return em.find(Job.class, jobKey);
}
@Override
public void update(Job job, boolean commitAfter) {
if (!em.getTransaction().isActive()){
// see Application#internalUpdateJob(Job)
logger.debug("Transaction is not active. Begin new one...");
// XXX Rewrite this to handle transactions more gracefully
em.getTransaction().begin();
}
em.merge(job);
if (commitAfter) {
em.getTransaction().commit();
}
}
}
DAO cache
public class JobDAOImplCache extends JobDAOImpl {
// ...
@Override
public Job find(Key jobKey) {
Object entityCacheKey = getEntityCacheKey(Job.class, getJobWideUniqueData(jobKey));
Job result = (Job) cache.get(entityCacheKey);
if (result != null) {
return result;
}
result = super.find(jobKey);
if (result != null) {
cache.put(entityCacheKey, result);
}
return result;
}
@Override
public void update(Job job, boolean commitAfter) {
super.update(job, commitAfter);
Object entityCacheKey = getEntityCacheKey(Job.class, getJobWideUniqueData(job.getKey()));
Job cachedJob = (Job)cache.get(entityCacheKey);
if (cachedJob != null) {
if (!cachedJob.getCronString().equals(job.getCronString())) {
abandonJobsByCronStringCache(cachedJob.getCronString());
abandonJobsByCronStringCache(job.getCronString());
}
cache.put(entityCacheKey, job);
} else {
abandonJobsByCronStringCache();
}
updateJobInScheduleCache(job);
}
}
Notice how the update method is implemented in JobDAOImplCache. If a DAO method changes an object in the database, it is responsible for updating all cached instances of that object in the entire cache. Such an implementation may be difficult to maintain; on the other hand, it can be very effective because you have full control over the cache.
Each *DAOImplCache class uses a two-level JSR-107 based cache: a local in-memory level that provides quick access to objects that were "touched" during the current request, and a shared level that allows application instances to share cached objects across the entire App Engine cluster.
Note that the local memory cache should be request scoped, or it may lead to stale data across app server instances. To reset the local cache after each request, it should be registered as a ThreadCleanupListener:
public static Cache buildCache(Logger logger, PerthreadManager perthreadManager) {
try {
CacheFactory cacheFactory = CacheManager.getInstance().getCacheFactory();
Cache cache = cacheFactory.createCache(Collections.emptyMap());
LocalMemorySoftCache cache2 = new LocalMemorySoftCache(cache);
// perthreadManager may be null if we are creating the cache from AbstractFilter
if (perthreadManager != null) {
perthreadManager.addThreadCleanupListener(cache2);
}
return cache2;
} catch (CacheException e) {
logger.error("Error instantiating cache", e);
return null;
}
}
Here's what the LocalMemorySoftCache implementation looks like:
public class LocalMemorySoftCache implements Cache, ThreadCleanupListener {
private final Cache cache;
private final Map<Object, Object> map;
@SuppressWarnings("unchecked")
public LocalMemorySoftCache(Cache cache) {
this.map = new SoftValueMap(100);
this.cache = cache;
}
@Override
public void clear() {
map.clear();
cache.clear();
}
@Override
public boolean containsKey(Object key) {
return map.containsKey(key)
|| cache.containsKey(key);
}
@Override
public Object get(Object key) {
Object value = map.get(key);
if (value == null) {
value = cache.get(key);
// Only remember real hits locally; don't store nulls for misses
if (value != null) {
map.put(key, value);
}
}
return value;
}
@Override
public Object put(Object key, Object value) {
map.put(key, value);
return cache.put(key, value);
}
@Override
public Object remove(Object key) {
map.remove(key);
return cache.remove(key);
}
// ...
/**
* Reset in-memory cache but leave original cache untouched.
*/
public void reset() {
map.clear();
}
@Override
public void threadDidCleanup() {
reset();
}
}
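The SoftValueMap used in the constructor above is not shown in this post. To illustrate the idea behind it, here is a minimal sketch of a map that holds its values through soft references, so the garbage collector may reclaim them under memory pressure. SimpleSoftValueMap is a hypothetical name, and this is not the class the listing above actually uses; it is also not thread-safe.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical minimal map whose values are held via SoftReference. */
class SimpleSoftValueMap<K, V> {
    private final Map<K, SoftReference<V>> refs = new HashMap<>();

    /** Returns the value, or null if it is absent or was reclaimed by the GC. */
    V get(K key) {
        SoftReference<V> ref = refs.get(key);
        if (ref == null) {
            return null;
        }
        V value = ref.get();
        if (value == null) {
            refs.remove(key); // the value was collected; drop the stale entry
        }
        return value;
    }

    /** Stores the value behind a soft reference; a null value removes the key. */
    void put(K key, V value) {
        if (value == null) {
            refs.remove(key);
        } else {
            refs.put(key, new SoftReference<>(value));
        }
    }

    /** Removes the entry and returns its value if it is still reachable. */
    V remove(K key) {
        SoftReference<V> ref = refs.remove(key);
        return ref == null ? null : ref.get();
    }

    void clear() {
        refs.clear();
    }

    int size() {
        return refs.size();
    }
}
```

Because entries may disappear at any time, callers have to treat a null from get() as a cache miss and fall back to the next cache level, which is what LocalMemorySoftCache.get() does.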
Make Tapestry-JPA Lazy
On every request Tapestry-JPA creates a new EntityManager and starts a new transaction on it. At the end of the request, if the current transaction is still active, it gets rolled back.
But if all data were taken from the cache, there won't be any interaction with the database. In this case the EntityManager creation and the transaction begin/rollback were not required, yet they consumed time and other resources.
Moreover, Tapestry-JPA creates an EntityManagerFactory instance on application load, which is very expensive, though you might not need it (because of the DAO cache or simply because the request isn't using the datastore at all).
To avoid this I created lazy implementations of JPAEntityManagerSource, JPATransactionManager and EntityManager; you can find them here: LazyJPAEntityManagerSource and LazyJPATransactionManager.
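The source of those lazy classes is not reproduced here. The underlying idea of deferring an expensive creation until first use can be sketched generically as follows; Lazy is a hypothetical helper and not the code of LazyJPAEntityManagerSource.

```java
import java.util.function.Supplier;

/** Hypothetical helper: creates the value on first access and reuses it afterwards. */
final class Lazy<T> implements Supplier<T> {
    private final Supplier<? extends T> factory;
    private T value;
    private boolean initialized;

    Lazy(Supplier<? extends T> factory) {
        this.factory = factory;
    }

    /** Runs the factory at most once, on the first call. */
    @Override
    public synchronized T get() {
        if (!initialized) {
            value = factory.get();
            initialized = true;
        }
        return value;
    }

    /** Lets cleanup code skip work when the value was never created. */
    synchronized boolean isInitialized() {
        return initialized;
    }
}
```

A request that is served entirely from the cache never calls get(), so the expensive factory never runs, and end-of-request cleanup can check isInitialized() before attempting a rollback.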
Read more [Dmitry Gusev Blog]
GAE and background workers in Tapestry5 app
In GAE you use the task queue API to implement background workers.
For instance, Ping Service uses task queues to ping web pages in batches according to a cron schedule.
In the task queue API every task is treated as an HTTP request to your application.
If you have just a few requests per billing period (say ~100 per day), then using Tapestry5 to handle task requests is not a bad idea, since this doesn't hurt billing too much.
But if you have thousands of requests (for instance, Ping Service currently serves ~13K background jobs per day), using Tapestry5 for this purpose will be a problem. Why? Because of the GAE load balancing policy. GAE may (and does) shut down and start up instances of your application sporadically for better utilization of its internal resources. And every time a load request happens, your T5 application has to load and initialize the entire app configuration again.
Tapestry5 page as background worker
To implement this approach you can just create a new T5 page and implement the worker logic in its onActivate() method. In this case you have all the power of Tapestry5 (IoC, built-in services, activation context, etc.).
In Tapestry5 every page should have a template file with markup. For background workers these would typically be files with empty/dummy markup, since nobody will access these pages from a browser. When I used this approach I used the @Meta(Application.NO_MARKUP) annotation and overrode the MarkupRender service so that it skips the normal rendering queue for pages having this annotation and returns empty content (<html></html>) to the client. Here's the discussion of implementation details.
Custom filter as background worker
Using the Filter API you can declare a custom filter that handles task requests. This way Tapestry5 isn't involved in processing at all, and there won't be any additional overhead during load requests.
<filter>
<filter-name>runJob</filter-name>
<filter-class>dmitrygusev.ping.filters.RunJobFilter</filter-class>
</filter>
<filter-mapping>
<filter-name>runJob</filter-name>
<url-pattern>/filters/runJob/*</url-pattern>
</filter-mapping>
The problem here is that Tapestry5 also uses the Filter API to handle requests, and its filter is usually declared to serve all incoming requests:
<filter-mapping>
<filter-name>app</filter-name>
<url-pattern>/*</url-pattern>
</filter-mapping>
To avoid loading Tapestry5 on such requests I implemented a LazyTapestryFilter class that checks whether the request is for a background worker URL and, if so, ignores it.
public class LazyTapestryFilter implements Filter {
private static final Logger logger = LoggerFactory.getLogger(LazyTapestryFilter.class);
private Filter tapestryFilter;
private FilterConfig config;
public static FilterConfig FILTER_CONFIG;
@Override
public void init(FilterConfig config) throws ServletException
{
FILTER_CONFIG = config;
this.config = config;
// Note: Uncomment this to profile Google API requests
// ApiProxy.setDelegate(new ProfilingDelegate(ApiProxy.getDelegate(), "dmitrygusev"));
}
@Override
public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
throws IOException, ServletException
{
String requestURI = ((HttpServletRequest) request).getRequestURI();
if (requestURI.startsWith("/filters/") || requestURI.equalsIgnoreCase("/favicon.ico"))
{
return;
}
if (tapestryFilter == null)
{
long startTime = System.currentTimeMillis();
logger.info("Creating Tapestry Filter...");
tapestryFilter = new TapestryFilter();
tapestryFilter.init(config);
logger.info("Tapestry Filter created and initialized ({} ms)", System.currentTimeMillis() - startTime);
}
tapestryFilter.doFilter(request, response, chain);
}
@Override
public void destroy()
{
// tapestryFilter is created lazily, so it may still be null here
if (tapestryFilter != null) {
tapestryFilter.destroy();
}
}
}
It's also a good idea to skip all non-Tapestry requests, which you usually declare in AppModule.java like this:
public static void contributeIgnoredPathsFilter(Configuration<String> configuration) {
// GAE filters (Admin Console)
configuration.add("/_ah/.*");
}
Note that Tapestry5 is required to ignore these paths to enable the Admin Console in the development server.
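The contributed strings are regular expressions. As a rough illustration of how such a list can be checked against a request path, here is a minimal sketch; it assumes each pattern must match the whole path and that matching is case-insensitive. IgnoredPaths is a hypothetical class for illustration, not Tapestry's own filter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper that checks a request path against ignored-path regexes. */
class IgnoredPaths {
    private final List<Pattern> patterns = new ArrayList<>();

    /** Registers a regular expression such as "/_ah/.*". */
    void add(String regex) {
        patterns.add(Pattern.compile(regex, Pattern.CASE_INSENSITIVE));
    }

    /** True if any registered pattern matches the entire path. */
    boolean isIgnored(String path) {
        for (Pattern pattern : patterns) {
            if (pattern.matcher(path).matches()) {
                return true;
            }
        }
        return false;
    }
}
```

With "/_ah/.*" registered, a path like /_ah/admin is ignored, while /index and the bare /_ah (no trailing slash) are not.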
A worker filter implementation should initialize as late as possible, i.e. not in the init() method but in doFilter(), because init() may be invoked during app server startup even if the incoming request does not match that filter:
@Override
public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
throws IOException, ServletException
{
long startTime = System.currentTimeMillis();
if (emf == null) {
lazyInit();
}
// ...
}
Also note that using this approach you will have to manage transactions manually. You may consider Ping Service AbstractFilter as a reference.
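Manual transaction management usually boils down to a begin/commit/rollback template. A minimal sketch of such a template is shown below; Tx is a hypothetical stand-in for the begin(), commit(), rollback() and isActive() methods of a JPA EntityTransaction so that the example stays self-contained, and this is not the code of the Ping Service AbstractFilter.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for the subset of a JPA EntityTransaction used here. */
interface Tx {
    void begin();
    void commit();
    void rollback();
    boolean isActive();
}

/** Hypothetical helper that runs a unit of work inside a transaction. */
final class Transactions {
    private Transactions() {
    }

    static <T> T inTransaction(Tx tx, Supplier<T> work) {
        tx.begin();
        try {
            T result = work.get();
            tx.commit();
            return result;
        } finally {
            // Still active means the work or the commit failed: undo the changes
            if (tx.isActive()) {
                tx.rollback();
            }
        }
    }
}
```

A worker filter could wrap its job logic in a call like Transactions.inTransaction(tx, () -> runJob()), where runJob() is a placeholder for the actual work.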
Read more [Dmitry Gusev Blog]
GAE and Tapestry5 Exception Handling
Tapestry5 uses its own technique to process unhandled exceptions.
When an unhandled exception occurs, Tapestry5 redirects the response to a special error page which is responsible for displaying the exception details.
There is a standard error page in Tapestry5 that can be very helpful for a developer if you configure your application to run in development mode. To do this you contribute the SymbolConstants.PRODUCTION_MODE symbol with the value "false" in your AppModule.java like this:
public static void contributeApplicationDefaults(
MappedConfiguration<String, String> configuration)
{
// ...
configuration.add(SymbolConstants.PRODUCTION_MODE, "false");
// ...
}
The standard error page provides all the information necessary to understand the cause of the exception:
And here is what the exception report looks like in production:
This is reasonable, because in production you usually don't want to display all this information to clients. But it is also not very user friendly, because it displays the value of Throwable.getMessage().
Tapestry5 allows overriding the standard error page with your own exception page, so you can display more user-friendly messages.
There's also another scenario, when you don't want Tapestry5 to generate an exception report at all and instead let the application server serve a static HTML page with apologies to the client. This approach suits production better, but in development mode it's better to leave the detailed error report as is.
To change the way Tapestry5 handles exceptions you should provide another implementation of RequestExceptionHandler. One way of doing this is to decorate RequestExceptionHandler:
public RequestExceptionHandler decorateRequestExceptionHandler(
final Logger logger,
final Response response,
@Symbol(SymbolConstants.PRODUCTION_MODE)
boolean productionMode)
{
// Leave default implementation of RequestExceptionHandler in development mode
if (!productionMode) return null;
// Provide simple implementation that logs exception and returns
// HTTP error code which will be handled by application server
return new RequestExceptionHandler()
{
public void handleRequestException(Throwable exception) throws IOException
{
logger.error("Unexpected runtime exception", exception);
// Return HTTP error code 500
response.sendError(HttpServletResponse.SC_INTERNAL_SERVER_ERROR, null);
}
};
}
Next, add this markup to web.xml:
<error-page>
<error-code>500</error-code>
<location>/500.html</location>
</error-page>
Now, in case of any exception, the client will see the contents of 500.html.
This approach has one more advantage on GAE: generating exception reports consumes billable CPU cycles and adds to request processing time.
Saving CPU cycles is good, and there is one note about request processing time. As you may know, on GAE each request has to be processed within 30 seconds. If it isn't, the runtime raises DeadlineExceededException and gives the application a few hundred milliseconds to fail gracefully. As practice shows, the default T5 RequestExceptionHandler plus error report generation usually takes longer.
One more note about GAE exception handling. Since version 1.3.6 GAE allows developers to declare custom static error handlers for GAE-specific errors: over_quota, dos_api_denial and timeout.
In the case of the first two errors GAE doesn't even pass requests to application code. Timeout errors appear as a result of application code execution, and (I suppose) this static error handler may conflict with a RequestExceptionHandler that turns DeadlineExceededException into HTTP error code 500.
I also want to share my implementation of the over_quota.html page. I noticed that free quotas get reset every day around 11am-12pm Moscow Summer Time (around 7am-8am UTC; I'm not sure if it's the same for other applications). I thought it would be good to include how much time is left until GAE re-enables the free quotas. And though over_quota.html is a static page, it is possible to include a piece of JavaScript that calculates this time in the client's timezone. Here it is:
<html>
<head>
<title>Ping Service - Over Capacity</title>
</head>
<body>
<h1>
Over Capacity
</h1>
<p>
We apologize for the inconvenience.
</p>
<p>
Service is temporarily unavailable until <span id="deadline">8:00 am UTC time.</span>
<script type="text/javascript">
var element = document.getElementById("deadline");
var now = new Date();
var deadline = new Date(now.getFullYear(), now.getMonth(), now.getDate(), 8);
var timezoneOffset = now.getTimezoneOffset() / 60;
deadline.setHours(deadline.getHours() - timezoneOffset);
if (deadline <= now) {
deadline.setDate(deadline.getDate() + 1);
}
element.innerHTML = deadline.toLocaleTimeString().replace(/:00$/, "")
+ " your time ("
+ Math.round((deadline - now) / 60 / 60 / 1000)
+ " hours left).";
</script>
</p>
</body>
</html>
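The arithmetic in the script above can also be checked on its own. Here is a minimal Java sketch of the same calculation, assuming the reset happens at 08:00 UTC (an observation from this post, not a documented GAE guarantee); QuotaReset and its methods are hypothetical names.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.LocalTime;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

/** Hypothetical helper mirroring the date arithmetic of the over_quota.html script. */
final class QuotaReset {
    // Assumed reset time, taken from the observation in the post
    static final LocalTime RESET_TIME_UTC = LocalTime.of(8, 0);

    private QuotaReset() {
    }

    /** Returns the next reset instant strictly after the given moment. */
    static Instant nextReset(Instant now) {
        ZonedDateTime nowUtc = now.atZone(ZoneOffset.UTC);
        ZonedDateTime candidate =
                nowUtc.toLocalDate().atTime(RESET_TIME_UTC).atZone(ZoneOffset.UTC);
        if (!candidate.isAfter(nowUtc)) {
            candidate = candidate.plusDays(1); // today's reset has already passed
        }
        return candidate.toInstant();
    }

    /** Hours until the next reset, rounded to the nearest hour like the script does. */
    static long hoursLeft(Instant now) {
        long minutes = Duration.between(now, nextReset(now)).toMinutes();
        return Math.round(minutes / 60.0);
    }
}
```

For example, at 06:30 UTC the next reset is at 08:00 UTC the same day, and at 09:00 UTC it is at 08:00 UTC the following day.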
See also
Read more [Dmitry Gusev Blog]
Mechanize and Encodings
Read more [Yury Korolev Blog]
AnjLab.FX Scheduler for ASP.NET
If you need a simple yet easily configurable scheduler in your ASP.NET application, AnjLab.FX Scheduler might be your choice.
To use AnjLab.FX Scheduler you need to do 3 simple steps:
public class HelloWorldTask : ICommand
{
private static readonly log4net.ILog Log = log4net.LogManager.GetLogger(typeof(HelloWorldTask));
public void Execute()
{
Log.Info("Hello World!");
}
}
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
<configSections>
...
<section name="triggers" type="AnjLab.FX.Tasks.Scheduling.SchedulerConfigSection, AnjLab.FX"/>
...
<configuration>
...
<triggers>
<!--
<daily tag='restoreDB' timeOfDay='23:00'/>
<weekly tag='backupDB' timeOfDay='01:30' weekDays='monday,friday'/>
<hourly tag='delTempFiles' minutes='30'/>
<interval tag='dumpLog' interval='00:05:00'/>
<once tag='upgradeDB' dateTime='01/15/2007 23:00'/>
<monthly tag='archiveDB' monthDay='29' timeOfDay='23:00'/>
-->
<interval tag='helloworld-task' interval='00:00:10'/>
</triggers>
...
</configuration>
Here we defined a named trigger, "helloworld-task", to fire every 10 seconds.
To map your task workers, create an instance of KeyedFactory and register your tasks. We suggest doing this in the Application_Start method of Global.asax:
protected void Application_Start(object sender, EventArgs e)
{
// Map trigger names to task workers
var factory = new KeyedFactory<string, ICommand>();
factory.RegisterType<HelloWorldTask>("helloworld-task");
// Start up scheduler
var scheduler = new Scheduler<ICommand>(factory);
var triggers = (List<ITrigger>)ConfigurationManager.GetSection("triggers");
scheduler.RegisterTriggers(triggers.ToArray());
scheduler.Start();
}
That's it!
Resources:
P.S.
By the way, you can also use this API to schedule tasks in Windows.Forms applications.
P.P.S.
AnjLab.FX is a framework we built while developing our projects. It continues to evolve, and you can use it in your applications without any restrictions.
Read more [Dmitry Gusev Blog]
C#: RefreshSection method of ConfigurationManager is not refreshing sections under debug mode
The RefreshSection method of ConfigurationManager does not refresh sections when you run the application from Visual Studio. Try running the application outside of VS.
Read more [Nikolay Zhebrun Blog]
